How to plot the hypothesis function on a graph by substituting the values of theta0 and theta1 - machine-learning

This is the hypothesis function: h(x) = theta0 + theta1 * x.
After substituting theta0 = 0 and theta1 = 0.5, how do I plot it on a graph?

It is plotted the same way we graph linear equations. Treat h(x) as y and each θ as a constant, so we have a linear expression of the form y = m + p * x (m, p are constants). To simplify further, take the function y = 2 + 4x. To plot it, pick values of x from a range, say [0, 5); for each value of x we then have a corresponding value of y. So our (x, y) set looks like ([0, 1, 2, 3, 4], [2, 6, 10, 14, 18]), and the graph can be plotted since we know both the x and y coordinates.
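As a minimal sketch of the worked example above (plain Python, before any plotting):

# y = 2 + 4x for x in [0, 5)
xs = list(range(5))
ys = [2 + 4 * x for x in xs]
print(list(zip(xs, ys)))  # [(0, 2), (1, 6), (2, 10), (3, 14), (4, 18)]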

You simply plot the line equation y = 0 + 0.5 * x, which gives a straight line through the origin with slope 0.5.
Here's how I did it with Python:

import matplotlib.pyplot as plt
import numpy as np

theta_0 = 0
theta_1 = 0.5

# the hypothesis function
def h(x):
    return theta_0 + theta_1 * x

x = np.arange(-100, 100)
y = h(x)  # vectorized: applies elementwise to the numpy array

plt.plot(x, y)
plt.ylabel(r'$h_\theta(x)$')
plt.xlabel(r'$x$')
plt.title(r'Plot of $h_\theta(x) = \theta_0 + \theta_1 \cdot x$')
plt.text(60, .025, r'$\theta_0=0,\ \theta_1=0.5$')
plt.show()

Related

include_bias in Polynomial Regression

I'm training a polynomial regression model after adding polynomial features with include_bias=True:
import numpy as np
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(100, 1)
from sklearn.preprocessing import PolynomialFeatures
poly_features = PolynomialFeatures(degree=2, include_bias=True)
X_poly = poly_features.fit_transform(X)
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)
print(lin_reg.intercept_, lin_reg.coef_)
# output: [1.95551099] [[0.         1.07234332 0.5122747 ]]
Question: include_bias essentially adds another feature column of 1s for the intercept parameter (theta0), so I'm expecting the intercept 1.9555 in place of the 0 in coef_. Why does this return 0 for theta0?
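A likely explanation (my note; not part of the original thread): LinearRegression fits its own intercept by default (fit_intercept=True), so the all-ones column produced by include_bias=True is redundant, its coefficient comes out as 0, and the intercept lands in intercept_ instead. A minimal sketch of the two usual fixes, reusing the variables above:

# Option 1: skip the bias column entirely and let LinearRegression fit the intercept
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
lin_reg = LinearRegression().fit(X_poly, y)

# Option 2: keep the bias column but disable sklearn's own intercept
X_poly = PolynomialFeatures(degree=2, include_bias=True).fit_transform(X)
lin_reg = LinearRegression(fit_intercept=False).fit(X_poly, y)
print(lin_reg.coef_)  # theta0 now appears as the first coefficient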

Logistic regression not able to find the value of theta

I have a hundred entries in a CSV file:
Physics,Maths,Status_class0or1
30,40,0
90,70,1
Using the above data I am trying to build a logistic (binary) classifier.
Please advise me where I am going wrong. Why am I getting the answer as a 3*3 matrix (9 values of theta, whereas there should be only 3)?
Here is the code:

# importing the libraries
import numpy as np
import pandas as pd
from sklearn import preprocessing

# reading data from the csv file
df = pd.read_csv("LogisticRegressionFirstBinaryClassifier.csv", header=None)
df.columns = ["Maths", "Physics", "AdmissionStatus"]
X = np.array(df[["Maths", "Physics"]])
y = np.array(df[["AdmissionStatus"]])
X = preprocessing.normalize(X)
X = np.c_[np.ones(X.shape[0]), X]
theta = np.ones((X.shape[1], 1))
print(X.shape)      # (100, 3)
print(y.shape)      # (100, 1)
print(theta.shape)  # (3, 1)

# calc_z to calculate the dot product of X and theta
def calc_z(X, theta):
    return np.dot(X, theta)

# sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# cost function
def cost_function(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

print("cost_function =", cost_function(X, y, theta))

def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation

print("derivativeofcostfunction=", derivativeofcostfunction(X, y, theta))

def grad_desc(X, y, theta, lr=.001, converge_change=.001):
    cost = cost_function(X, y, theta)
    change_cost = 1
    num_iter = 1
    while change_cost > converge_change:
        old_cost = cost
        print(theta)
        print(derivativeofcostfunction(X, y, theta))
        theta = theta - lr * (derivativeofcostfunction(X, y, theta))
        cost = cost_function(X, y, theta)
        change_cost = old_cost - cost
        num_iter += 1
    return theta, num_iter
Here is the output:
[[ 0.4185146 -0.56877556 0.63999433]
[15.39722864 9.73995197 11.07882445]
[12.77277463 7.93485324 9.24909626]]
[[0.33944777 0.58199037 0.52493407]
[0.02106587 0.36300629 0.30297278]
[0.07040604 0.3969297 0.33737757]]
[[-0.05856159 -0.89826735 0.30849185]
[15.18035041 9.59004868 10.92827046]
[12.4804775 7.73302024 9.04599788]]
[[0.33950634 0.58288863 0.52462558]
[0.00588552 0.35341624 0.29204451]
[0.05792556 0.38919668 0.32833157]]
[[-5.17526527e-01 -1.21534937e+00 -1.03387571e-02]
[ 1.49729502e+01 9.44663458e+00 1.07843504e+01]
[ 1.21978140e+01 7.53778010e+00 8.84964495e+00]]
(array([[ 0.34002386, 0.58410398, 0.52463592],
[-0.00908743, 0.34396961, 0.28126016],
[ 0.04572775, 0.3816589 , 0.31948193]]), 46)
I changed this code by just adding a transpose when returning the matrix, and it fixed my issue:

def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation.T
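A note on why the transpose matters (my explanation, not part of the original answer): np.dot((h - y).T, X) has shape (1, 3) while theta has shape (3, 1), so theta - lr * gradient broadcasts into a (3, 3) matrix, which is exactly the 9-value output above. A quick sketch of the broadcasting effect:

import numpy as np
theta = np.ones((3, 1))        # column vector, shape (3, 1)
grad = np.ones((1, 3))         # row vector, shape (1, 3)
print((theta - grad).shape)    # (3, 3) -- broadcasting blows the shape up
print((theta - grad.T).shape)  # (3, 1) -- the transpose keeps theta a column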

What does the distance in machine learning signify?

In neural networks, we regularly use the equation:
w1*x1 + w2*x2 + w3*x3 + ...
We can interpret this as the equation of a line, with each x as a dimension. To make things clearer, let's take the example of a simple perceptron network.
Imagine a single-layer perceptron with two inputs/features (x1 and x2) and one output (y). (Sorry, Stack Overflow didn't allow me to post an additional image.)
Let
R = w1*x1 + w2*x2
y = 1 if R >= threshold
y = 0 if R < threshold
Scenario 1:
Threshold = 0
w1 = 2, w2 = -1
The line separating class 0 and 1 has the equation 2*x1 - x2 = 0
Suppose we get a test sample
P = (1,1)
R = 2*1 - 1 = 1 > 0
Sample P belongs to class 1
My question is: what is this R?
From the figure, it's the horizontal distance from the line.
Scenario 2:
Threshold = 0
w1 = 2, w2 = 1
The line separating class 0 and 1 has the equation 2*x1 + x2 = 0
P = (1,1)
R = 2*1 + 1 = 3 > 0
Sample P belongs to class 1
From the figure, it's the vertical distance from the line.
R is supposed to be some form of distance from the classifying line: the greater the distance, the farther the point is from the line, and the more confident we are in the classification.
I just want to know: what kind of distance from the line is R?
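For what it's worth (my note; no answer appears in this excerpt): R = w1*x1 + w2*x2 is not a horizontal or vertical distance in general. It is proportional to the signed perpendicular distance of the point from the line w1*x1 + w2*x2 = 0; dividing by the norm of the weight vector gives the actual perpendicular distance, R / ||w||. A small sketch for scenario 1:

import numpy as np
w = np.array([2.0, -1.0])     # weights from scenario 1
P = np.array([1.0, 1.0])      # test sample
R = np.dot(w, P)              # raw activation: 1.0
dist = R / np.linalg.norm(w)  # signed perpendicular distance: ~0.447
print(R, dist)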

How do I implement the optimization function in TensorFlow?

min_C Σ_i ( ||x_i - X*c_i||^2 + λ*||c_i|| ),
s.t. c_ii = 0,
where X is a matrix of shape d x n, C is of shape n x n, and x_i and c_i denote the i-th columns of X and C respectively.
X is known here, and based on X we want to find C.
Usually with a loss like that you need to vectorize it, instead of working with columns:

loss = X - tf.matmul(X, C)
loss = tf.reduce_sum(tf.square(loss))
reg_loss = tf.reduce_sum(tf.square(C), 0)   # squared L2 norm of each column
reg_loss = tf.reduce_sum(tf.sqrt(reg_loss)) # sum of the column L2 norms
total_loss = loss + lambd * reg_loss

To implement the zero constraint on the diagonal of C, the best way is to add it to the loss as a penalty with another constant lambd2:

reg_loss2 = tf.trace(tf.square(C))  # sum of c_ii^2 (elementwise square, then trace)
total_loss = total_loss + lambd2 * reg_loss2
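Not from the original answer: here is a self-contained sketch of the same idea in TensorFlow 2 eager style, where the dimensions and hyperparameters (d, n, lambd, lambd2) are made-up placeholders:

import numpy as np
import tensorflow as tf

d, n = 5, 10
lambd, lambd2 = 0.1, 10.0
X = tf.constant(np.random.randn(d, n), dtype=tf.float32)  # known data matrix
C = tf.Variable(tf.random.normal((n, n)))                 # coefficients to learn
opt = tf.keras.optimizers.Adam(learning_rate=0.01)

for step in range(500):
    with tf.GradientTape() as tape:
        recon = tf.reduce_sum(tf.square(X - tf.matmul(X, C)))        # Σ ||x_i - X c_i||^2
        col_norms = tf.sqrt(tf.reduce_sum(tf.square(C), axis=0))     # ||c_i|| per column
        diag_pen = tf.reduce_sum(tf.square(tf.linalg.diag_part(C)))  # Σ c_ii^2 penalty
        total = recon + lambd * tf.reduce_sum(col_norms) + lambd2 * diag_pen
    grads = tape.gradient(total, [C])
    opt.apply_gradients(zip(grads, [C]))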

Linear regression (gradient descent update) - training-set error is higher than testing error

My algorithm is like this:
data is stored as:
data = [record1, record2, ...]
where record1 is [1, x1, x2, ..., x_m], i.e. a leading 1 followed by the m feature values for that record.
theta is the parameter vector of the linear regression function, of size m+1.
y is the array of true labels, of length len(data) (y[0] is the true value for record 0).
Linear regression stochastic update:

iterations = 0
while True:
    for i in range(len(data)):
        x = np.asarray(data[i])
        # vectorized update of all m+1 components of theta at once
        theta = theta - my_lambda * (np.dot(theta, x) - y[i]) * x
    iterations += 1
    j_theta = compute_J_of_theta(data, y, theta)
    print("Iteration #:", iterations, " j_theta:", j_theta)
    if j_theta < 5000:
        # print("FINALLY CONVERGED!!!!")
        break

def compute_J_of_theta(data, y, theta):
    """
    Convergence criterion:
    compute J(theta) = 1/(2*M) * sum((h_theta(x_i) - y_i)**2)
    """
    M = len(data)
    temp = 0
    for i in range(M):
        x = np.asarray(data[i])
        temp += (np.dot(theta, x) - y[i])**2
    return temp / (2 * M)  # parentheses matter: temp/2*M would compute (temp/2)*M
my_lambda is very small.
Initially theta is the zero vector of size m+1.
Question: the training-set error is higher than the testing error... WHY? What's wrong with this?
EDIT 1:
It was my stupid mistake in calculating the error.
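A plausible reconstruction of that mistake (my guess; the post doesn't say which error it was): an unparenthesized return temp/2*M evaluates as (temp/2)*M rather than temp/(2*M), inflating J(theta) by a factor of M^2:

# hypothetical values just to show the precedence difference
temp, M = 100.0, 50
print(temp / 2 * M)    # 2500.0 -- evaluates as (temp/2)*M
print(temp / (2 * M))  # 1.0    -- the intended 1/(2M) scaling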
