How do I optimize this code so that it executes faster (max ~5 mins)? - python-turtle

I am trying to make a program in turtle that creates a Lyapunov fractal. However, timing it with timeit suggests it would take around 3 hours to complete, or 1.5 hours if I compromise on resolution (N).
import turtle as t; from math import log; from timeit import default_timer as dt
t.setup(2000,1000,0); swid=t.window_width(); shei=t.window_height(); t.up(); t.ht(); t.tracer(False); t.colormode(255); t.bgcolor('pink')
def lyapunov(seq,xmin,xmax,ymin,ymax,N,tico):
    truseq=str(bin(seq))[2:]
    for x in range(-swid//2+2,swid//2-2):
        tx=(x*(xmax-xmin))/swid+(xmax+xmin)/2
        if x==-swid//2+2:
            startt=dt()
        for y in range(-shei//2+11,shei//2-1):
            t.goto(x,y); ty=(y*(ymax-ymin))/shei+(ymax+ymin)/2; lex=0; xn=prevxn=0.5
            for n in range(1,N+1):
                if truseq[n%len(truseq)]=='0': rn=tx
                else: rn=ty
                xn=rn*prevxn*(1-prevxn)
                prevxn=xn
                if xn!=1: lex+=(1/N)*log(abs(rn*(1-2*xn)))
            if lex>0: t.pencolor(0,0,min(int(lex*tico),255))
            else: t.pencolor(max(255+int(lex*tico),0),max(255+int(lex*tico),0),0)
            t.dot(size=1); t.update()
        if x==-swid//2+2:
            endt=dt()
            print(f'Estimated total time: {(endt-startt)*(swid-5)} secs')
#Example: lyapunov(2,2.0,4.0,2.0,4.0,10000,100)
I attempted to use yield but I couldn't figure out where it should go.

On my slower machine, I was only able to test with a tiny N (e.g. 10), but I was able to speed up the code about 350 times. (Though this will clearly be lower as N increases.) There are two problems with your use of update(). The first is that you call it too often: you should outdent it from the y loop to the x loop so it's only called once per vertical pass. Second, the dot() method forces an automatic update(), so you get no advantage from using tracer(). Replace dot() with some other method of drawing a pixel and you'll get back the advantage of using tracer() and update(), as long as you move update() out of the innermost loop as noted above.
My rework of your code where I tried out these, and other, changes:
from turtle import Screen, Turtle
from math import log
from timeit import default_timer

def lyapunov(seq, xmin, xmax, ymin, ymax, N, tico):
    xdif = xmax - xmin
    ydif = ymax - ymin
    truseq = str(bin(seq))[2:]
    for x in range(2 - swid_2, swid_2 - 2):
        if x == 2 - swid_2:
            startt = default_timer()
        # map the pixel column to [xmin, xmax], centred on the midpoint as in the question's code
        tx = x * xdif / swid + (xmax + xmin) / 2
        for y in range(11 - shei_2, shei_2 - 1):
            ty = y * ydif / shei + (ymax + ymin) / 2
            lex = 0
            xn = prevxn = 0.5
            for n in range(1, N+1):
                rn = tx if truseq[n % len(truseq)] == '0' else ty
                xn = rn * prevxn * (1 - prevxn)
                prevxn = xn
                if xn != 1:
                    lex += 1/N * log(abs(rn * (1 - xn*2)))
            if lex > 0:
                turtle.pencolor(0, 0, min(int(lex * tico), 255))
            else:
                lex_tico = max(int(lex * tico) + 255, 0)
                turtle.pencolor(lex_tico, lex_tico, 0)
            # draw a single pixel without dot(), which would force an update
            turtle.goto(x, y)
            turtle.pendown()
            turtle.penup()
        # refresh the screen once per column instead of once per pixel
        screen.update()
        if x == 2 - swid_2:
            endt = default_timer()
            print(f'Estimated total time: {(endt - startt) * (swid - 5)} secs')

screen = Screen()
screen.setup(2000, 1000, startx=0)
screen.bgcolor('pink')
screen.colormode(255)
screen.tracer(False)
swid = screen.window_width()
shei = screen.window_height()
swid_2 = swid//2
shei_2 = shei//2
turtle = Turtle()
turtle.hideturtle()
turtle.penup()
turtle.setheading(90)
lyapunov(2, 2.0, 4.0, 2.0, 4.0, 10, 100)
screen.exitonclick()
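The per-pixel arithmetic can also be pulled out of the drawing loop so it can be timed and checked without opening a window. The following is only a sketch under assumptions: the name lyapunov_exponent and its parameters are hypothetical (not from the code above), the A/B lookup is precomputed once instead of comparing characters on every iteration, and log(0) is skipped explicitly, which differs slightly from the `xn != 1` test in the original.

```python
from math import log

def lyapunov_exponent(bits, a, b, n_iter):
    """Average log-derivative of the logistic map driven by the sequence `bits`.

    `bits` is a string of '0'/'1' (as produced by bin(seq)[2:]);
    '0' selects rate a and '1' selects rate b. Hypothetical helper.
    """
    rates = [a if bit == '0' else b for bit in bits]  # precomputed lookup
    period = len(rates)
    xn = 0.5
    total = 0.0
    for n in range(1, n_iter + 1):
        rn = rates[n % period]
        xn = rn * xn * (1 - xn)
        deriv = abs(rn * (1 - 2 * xn))
        if deriv > 0:  # log(0) would raise ValueError, so skip that term
            total += log(deriv)
    return total / n_iter

if __name__ == "__main__":
    # A negative exponent indicates a stable orbit, a positive one chaos.
    print(lyapunov_exponent('10', 3.2, 3.2, 2000))
    print(lyapunov_exponent('10', 4.0, 4.0, 2000))
```

With the computation isolated like this, the drawing code only needs the returned value to pick a colour, and the function can be profiled on its own for large N.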

Related

Implementing linear regression from scratch in python

I'm trying to implement linear regression in Python using the following gradient descent formulas (note that these are the formulas after taking the partial derivatives):
slope: w := w - a * (1/m) * sum((y_hat_i - y_i) * x_i)
y_intercept: b := b - a * (1/m) * sum(y_hat_i - y_i)
but the code keeps giving me weird results. I think (I'm not sure) that the error is in the gradient_descent function.
import numpy as np

class LinearRegression:
    def __init__(self , x:np.ndarray ,y:np.ndarray):
        self.x = x
        self.m = len(x)
        self.y = y

    def calculate_predictions(self ,slope:int , y_intercept:int) -> np.ndarray: # Calculate y hat.
        predictions = []
        for x in self.x:
            predictions.append(slope * x + y_intercept)
        return predictions

    def calculate_error_cost(self , y_hat:np.ndarray) -> int:
        error_valuse = []
        for i in range(self.m):
            error_valuse.append((y_hat[i] - self.y[i] )** 2)
        error = (1/(2*self.m)) * sum(error_valuse)
        return error

    def gradient_descent(self):
        costs = []
        # initialization values
        temp_w = 0
        temp_b = 0
        a = 0.001 # Learning rate
        while True:
            y_hat = self.calculate_predictions(slope=temp_w , y_intercept= temp_b)
            sum_w = 0
            sum_b = 0
            for i in range(len(self.x)):
                sum_w += (y_hat[i] - self.y[i] ) * self.x[i]
                sum_b += (y_hat[i] - self.y[i] )
            w = temp_w - a * ((1/self.m) *sum_w)
            b = temp_b - a * ((1/self.m) *sum_b)
            temp_w = w
            temp_b = b
            costs.append(self.calculate_error_cost(y_hat))
            try:
                if costs[-1] > costs[-2]: # If global minimum reached
                    return [w,b]
            except IndexError:
                pass
I used this dataset:
https://www.kaggle.com/datasets/tanuprabhu/linear-regression-dataset?resource=download
After downloading it, I load it like this:
import pandas
p = pandas.read_csv('linear_regression_dataset.csv')
l = LinearRegression(x= p['X'] , y= p['Y'])
print(l.gradient_descent())
But it's giving me [-568.1905905426412, -2.833321633515304], which is definitely not accurate.
I want to implement the algorithm not using external modules like scikit-learn for learning purposes.
I tested the calculate_error_cost function and it worked as expected and I don't think that there is an error in the calculate_predictions function
One small problem you have is that you are returning the last values of w and b, when you should be returning the second-to-last parameters (because they yield a lower cost). This should not really matter that much... unless your learning rate is too high and you are immediately getting a higher value for the cost function on the second iteration. This I believe is your real problem, judging from the dataset you shared.
The algorithm does work on the dataset, but you need to change the learning rate. I ran it in the example below and it gave the result shown in the image. One caveat is that I added a limit on the number of iterations to keep the algorithm from taking too long (while only marginally improving the result).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

class LinearRegression:
    def __init__(self , x:np.ndarray ,y:np.ndarray):
        self.x = x
        self.m = len(x)
        self.y = y

    def calculate_predictions(self ,slope:int , y_intercept:int) -> np.ndarray: # Calculate y hat.
        predictions = []
        for x in self.x:
            predictions.append(slope * x + y_intercept)
        return predictions

    def calculate_error_cost(self , y_hat:np.ndarray) -> int:
        error_valuse = []
        for i in range(self.m):
            error_valuse.append((y_hat[i] - self.y[i] )** 2)
        error = (1/(2*self.m)) * sum(error_valuse)
        return error

    def gradient_descent(self):
        costs = []
        # initialization values
        temp_w = 0
        temp_b = 0
        iteration = 0
        a = 0.00001 # Learning rate
        while iteration < 1000:
            y_hat = self.calculate_predictions(slope=temp_w , y_intercept= temp_b)
            sum_w = 0
            sum_b = 0
            for i in range(len(self.x)):
                sum_w += (y_hat[i] - self.y[i] ) * self.x[i]
                sum_b += (y_hat[i] - self.y[i] )
            w = temp_w - a * ((1/self.m) *sum_w)
            b = temp_b - a * ((1/self.m) *sum_b)
            costs.append(self.calculate_error_cost(y_hat))
            try:
                if costs[-1] > costs[-2]: # If global minimum reached
                    print(costs)
                    return [temp_w,temp_b]
            except IndexError:
                pass
            temp_w = w
            temp_b = b
            iteration += 1
        print(iteration)
        return [temp_w,temp_b]

p = pd.read_csv('linear_regression_dataset.csv')
x_data = p['X']
y_data = p['Y']
lin_reg = LinearRegression(x_data, y_data)
y_hat = lin_reg.calculate_predictions(*lin_reg.gradient_descent())
fig = plt.figure()
plt.plot(x_data, y_data, 'r.', label='Data')
plt.plot(x_data, y_hat, 'b-', label='Linear Regression')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
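To see the effect of the learning rate in isolation, here is a small dependency-free sketch. It is not the code above: fit_line and cost are hypothetical names, and the data is synthetic (y = 2x + 1 for x = 0..99), chosen only because its x values are large enough for a rate of 0.001 to overshoot. The loop stops as soon as a step would increase the cost and returns the parameters from before that step.

```python
def cost(xs, ys, w, b):
    """Halved mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

def fit_line(xs, ys, rate, max_iter=1000):
    """Gradient descent that keeps the last parameters that lowered the cost."""
    m = len(xs)
    w, b = 0.0, 0.0
    current = cost(xs, ys, w, b)
    for step in range(max_iter):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        new_w = w - rate * sum(e * x for e, x in zip(errors, xs)) / m
        new_b = b - rate * sum(errors) / m
        new_cost = cost(xs, ys, new_w, new_b)
        if new_cost > current:
            return w, b, step  # the step made things worse: keep the old values
        w, b, current = new_w, new_b, new_cost
    return w, b, max_iter

# Synthetic data, an assumption for illustration only.
xs = list(range(100))
ys = [2 * x + 1 for x in xs]

w_bad, b_bad, steps_bad = fit_line(xs, ys, rate=0.001)   # overshoots on the first step
w_ok, b_ok, steps_ok = fit_line(xs, ys, rate=0.0001)     # cost keeps decreasing

if __name__ == "__main__":
    print(steps_bad, w_bad, b_bad)
    print(steps_ok, w_ok, b_ok)
```

On this data the larger rate is rejected immediately, while the smaller one runs all iterations and brings the slope close to 2. The intercept converges much more slowly, which is typical when x is not scaled.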

How to find and update levels accordingly based on points?

I am creating a Rails application that works like a game, so it has points and levels. For example, to reach level 1 the user needs at least 100 points, and to reach level 2 the user needs 200 points. The point difference between consecutive levels changes after every 10 levels: the difference between levels 1 and 2 is 100 points, the difference between levels 11 and 12 is 150 points, and so on. There is no upper bound on levels.
Now my question is: let's say a user's total is 3150 points and it just got updated to 3155. What's the optimal way to find the current level and update it if needed?
I can get a solution using nested while loops, which gives a result in O(n^2). I need something better.
I think this code works but I'm not sure if this is the best way to go about it
def get_level(points)
  diff = 100
  sum = 0
  level = -1
  current_level = 0
  while level.negative?
    10.times do |i|
      current_level += 1
      sum += diff
      if points > sum
        next
      elsif points <= sum
        level = current_level
        break
      end
    end
    diff += 50
  end
  puts level
end
I wrote a get_points function (it should not be difficult). Then, based on it, I wrote a get_level function, in which it was necessary to solve a quadratic equation to find the high value and then calculate low.
If you have any questions, let me know.
Check output here.
#!/usr/bin/env python3
import math

def get_points(level):
    high = (level + 1) // 10
    low = (level + 1) % 10
    high_point = 250 * high * high + 750 * high # (3 + high) * high // 2 * 500
    low_point = (100 + 50 * high) * low
    return low_point + high_point

def get_level(points):
    # quadratic equation
    a = 250
    b = 750
    c = -points
    d = b * b - 4 * a * c
    x = (-b + math.sqrt(d)) / (2 * a)
    high = int(x)
    remainder = points - (250 * high * high + 750 * high)
    low = remainder // (100 + 50 * high)
    level = high * 10 + low
    return level

def main():
    for l in range(0, 40):
        print(f'{l:3d} {get_points(l - 1):5d}..{get_points(l) - 1}')
    for level, (l, r) in (
        (1, (100, 199)),
        (2, (200, 299)),
        (9, (900, 999)),
        (10, (1000, 1149)),
        (11, (1150, 1299)),
        (19, (2350, 2499)),
        (20, (2500, 2699)),
    ):
        for p in range(l, r + 1): # for in [l, r]
            assert get_level(p) == level, f'{p} {l}'

if __name__ == '__main__':
    main()
Why did you set the value of a=250 and b = 750? Can you explain that to me please?
Let's write out every 10 level and the difference between points:
lvl - pnt (+delta)
10 - 1000 (+1000 = +100 * 10)
20 - 2500 (+1500 = +150 * 10)
30 - 4500 (+2000 = +200 * 10)
40 - 7000 (+2500 = +250 * 10)
Divide by 500 (10 levels * 50, the amount by which the difference changes) to get the partial sums of an arithmetic progression starting at 2:
10 - 2 (+2)
20 - 5 (+3)
30 - 9 (+4)
40 - 14 (+5)
Using the formula for the sum of an arithmetic progression, the points for level = k * 10 are equal to:
sum(x for x in 2..k+1) * 500 =
(2 + k + 1) * k / 2 * 500 =
(3 + k) * k * 250 =
250 * k * k + 750 * k
Now we have points and want to find the maximum high such that points >= 250 * high^2 + 750 * high, i.e. 250 * high^2 + 750 * high - points <= 0. The coefficient a = 250 is positive, so the parabola opens upward. We therefore take the larger root of the quadratic equation 250 * high^2 + 750 * high - points = 0 and discard its fractional part (this is high = int(x) in the Python script).
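The derivation can be cross-checked against a plain level-by-level walk. In this sketch, level_by_loop and level_closed_form are hypothetical names, and math.isqrt (Python 3.8+) is swapped in for math.sqrt so the root is computed in exact integer arithmetic. That is a variation on the script above, not part of it.

```python
import math

def level_by_loop(points):
    """Reference implementation: climb one level at a time."""
    level, needed = 0, 0
    while True:
        step = 100 + 50 * (level // 10)  # points needed to go from `level` to `level + 1`
        if needed + step > points:
            return level
        needed += step
        level += 1

def level_closed_form(points):
    """Largest k with 250*k^2 + 750*k <= points, then the remainder within that block of ten."""
    # discriminant: 750^2 + 4 * 250 * points
    high = (math.isqrt(562500 + 1000 * points) - 750) // 500
    remainder = points - (250 * high * high + 750 * high)
    return high * 10 + remainder // (100 + 50 * high)

if __name__ == "__main__":
    # The example from the question: both totals fall inside the same level.
    print(level_closed_form(3150), level_closed_form(3155))
```

Because the closed form runs in constant time, recomputing the level on every points update and comparing it with the stored level is cheap.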

Lua Separation Steering algorithm groups overlapping rooms into one corner

I'm trying to implement a dungeon generation algorithm (presented here and demo-ed here ) that involves generating a random number of cells that overlap each other. The cells then are pushed apart/separated and then connected. Now, the original poster/author described that he is using a Separation Steering Algorithm in order to uniformly distribute the cells over an area. I haven't had much experience with flocking algorithm and/or separation steering behavior, thus I turned to google for an explanation (and found this ). My implementation (based on the article last mentioned) is as follows:
function pdg:_computeSeparation(_agent)
    local neighbours = 0
    local rtWidth = #self._rooms
    local v =
    {
        x = self._rooms[_agent].startX,
        y = self._rooms[_agent].startY,
        --velocity = 1,
    }
    for i = 1, rtWidth do
        if _agent ~= i then
            local distance = math.dist(self._rooms[_agent].startX,
                                       self._rooms[_agent].startY,
                                       self._rooms[i].startX,
                                       self._rooms[i].startY)
            if distance < 12 then
                --print("Separating agent: ".._agent.." from agent: "..i.."")
                v.x = (v.x + self._rooms[_agent].startX - self._rooms[i].startX) * distance
                v.y = (v.y + self._rooms[_agent].startY - self._rooms[i].startY) * distance
                neighbours = neighbours + 1
            end
        end
    end
    if neighbours == 0 then
        return v
    else
        v.x = v.x / neighbours
        v.y = v.y / neighbours
        v.x = v.x * -1
        v.y = v.y * -1
        pdg:_normalize(v, 1)
        return v
    end
end
self._rooms is a table that contains the original X and Y position of the room in the grid, along with its width and height (endX, endY).
The problem is that, instead of tidily arranging the cells on the grid, it takes the overlapping cells and moves them into an area that goes from 1,1 to distance+2, distance+2 (as seen in my video [youtube])
I'm trying to understand why this is happening.
In case it's needed, here I parse the grid table, separate and fill the cells after the separation:
function pdg:separate( )
    if #self._rooms > 0 then
        --print("NR ROOMS: "..#self._rooms.."")
        -- reset the map to empty
        for x = 1, self._pdgMapWidth do
            for y = 1, self._pdgMapHeight do
                self._pdgMap[x][y] = 4
            end
        end
        -- now, we separate the rooms
        local numRooms = #self._rooms
        for i = 1, numRooms do
            local v = pdg:_computeSeparation(i)
            --we adjust the x and y positions of the items in the room table
            self._rooms[i].startX = v.x
            self._rooms[i].startY = v.y
            --self._rooms[i].endX = v.x + self._rooms[i].endX
            --self._rooms[i].endY = v.y + self._rooms[i].endY
        end
        -- we render them again
        for i = 1, numRooms do
            local px = math.abs( math.floor(self._rooms[i].startX) )
            local py = math.abs( math.floor(self._rooms[i].startY) )
            for k = self.rectMinWidth, self._rooms[i].endX do
                for v = self.rectMinHeight, self._rooms[i].endY do
                    print("PX IS AT: "..px.." and k is: "..k.." and their sum is: "..px+k.."")
                    print("PY IS AT: "..py.." and v is: "..v.." and their sum is: "..py+v.."")
                    if k == self.rectMinWidth or
                       v == self.rectMinHeight or
                       k == self._rooms[i].endX or
                       v == self._rooms[i].endY then
                        self._pdgMap[px+k][py+v] = 1
                    else
                        self._pdgMap[px+k][py+v] = 2
                    end
                end
            end
        end
    end
end
I have implemented this generation algorithm as well, and I came across more or less the same issue. All of my rectangles ended up in the topleft corner.
My problem was that I was normalizing velocity vectors with zero length. If you normalize those, you divide by zero, resulting in NaN.
You can fix this by simply performing a check whether your velocity's length is zero before using it in any further calculations.
I hope this helps!
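As an illustration of that check (written in Python rather than Lua, with a hypothetical normalize helper that is not the question's pdg:_normalize), a zero-length vector is returned unchanged instead of being divided by its length. Note that Python raises ZeroDivisionError for 0.0 / 0.0, whereas Lua yields NaN, so the failure is louder in Python, but the guard is the same.

```python
import math

def normalize(vx, vy, length=1.0):
    """Scale (vx, vy) to the given length; a zero vector stays (0.0, 0.0)."""
    magnitude = math.hypot(vx, vy)
    if magnitude == 0:
        # Nothing to push against: return a zero vector instead of dividing by zero.
        return 0.0, 0.0
    return vx / magnitude * length, vy / magnitude * length

if __name__ == "__main__":
    print(normalize(3.0, 4.0))  # a unit vector in the direction of (3, 4)
    print(normalize(0.0, 0.0))  # stays at the origin
```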
Uhm I know it's an old question, but I noticed something and maybe it can be useful to somebody, so...
I think there's a problem here:
v.x = (v.x + self._rooms[_agent].startX - self._rooms[i].startX) * distance
v.y = (v.y + self._rooms[_agent].startY - self._rooms[i].startY) * distance
Why do you multiply these equations by the distance?
"(self._rooms[_agent].startX - self._rooms[i].startX)" already scales with the distance between the two rooms (it is the signed offset along that axis)!
Plus, by multiplying everything by "distance" you modify the previous results stored in v!
If you at least moved "v.x" outside the brackets, the result would merely be larger, and the normalize function would fix that, although it would still be a wasted calculation...
By the way I'm pretty sure the code should be like:
v.x = v.x + (self._rooms[_agent].startX - self._rooms[i].startX)
v.y = v.y + (self._rooms[_agent].startY - self._rooms[i].startY)
I'll make an example. Imagine you have your main agent in (0,0) and three neighbours in (0,-2), (-2,0) and (0,2). A separation steering behaviour would move the main agent toward the X axis, at a normalized direction of (1,0).
Let's focus only on the Y component of the result vector.
The math should be something like this:
--Iteration 1
v.y = 0 + ( 0 + 2 )
--Iteration 2
v.y = 2 + ( 0 - 0 )
--Iteration 3
v.y = 2 + ( 0 - 2 )
--Result
v.y = 0
Which is consistent with our theory.
This is what your code does:
(note that the distance is always 2)
--Iteration 1
v.y = ( 0 + 0 + 2 ) * 2
--Iteration 2
v.y = ( 4 + 0 - 0 ) * 2
--Iteration 3
v.y = ( 8 + 0 - 2 ) * 2
--Result
v.y = 12
And if I got the separation steering behaviour right this can't be correct.
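The worked example above can be reproduced with a short sketch. This is Python rather than Lua, the name separation and the radius default are assumptions for illustration, and it accumulates (agent - neighbour) starting from zero, so no final sign flip is applied and the result is a direction rather than a new position.

```python
import math

def separation(agent, neighbours, radius=12.0):
    """Unit vector pointing away from the neighbours within `radius` of `agent`."""
    ax, ay = agent
    sum_x = sum_y = 0.0
    count = 0
    for nx, ny in neighbours:
        if math.hypot(ax - nx, ay - ny) < radius:
            # Plain accumulation of the offset, not multiplied by the distance.
            sum_x += ax - nx
            sum_y += ay - ny
            count += 1
    if count == 0:
        return 0.0, 0.0
    sum_x /= count
    sum_y /= count
    magnitude = math.hypot(sum_x, sum_y)
    if magnitude == 0:
        return 0.0, 0.0  # neighbours cancel out: no preferred direction
    return sum_x / magnitude, sum_y / magnitude

if __name__ == "__main__":
    # Main agent at the origin with the three neighbours from the example;
    # the result points along the positive X axis.
    print(separation((0, 0), [(0, -2), (-2, 0), (0, 2)]))
```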

Gradient Descent Implementation in Python returns Nan

I am trying to implement gradient descent in Python; the implementation works when I try it with training_set1, but it returns not-a-number (nan) when I try it with training_set. Any idea why my code is broken?
from collections import namedtuple

TrainingInstance = namedtuple("TrainingInstance", ['X', 'Y'])
training_set1 = [TrainingInstance(0, 4), TrainingInstance(1, 7),
                 TrainingInstance(2, 7), TrainingInstance(3, 8),
                 TrainingInstance(8, 12)]
training_set = [TrainingInstance(60, 3.1), TrainingInstance(61, 3.6),
                TrainingInstance(62, 3.8), TrainingInstance(63, 4),
                TrainingInstance(65, 4.1)]

def grad_desc(x, x1):
    # minimize a cost function of two variables using gradient descent
    training_rate = 0.1
    iterations = 5000
    #while sqrd_error(x, x1) > 0.0000001:
    while iterations > 0:
        #print sqrd_error(x, x1)
        x, x1 = x - (training_rate * deriv(x, x1)), x1 - (training_rate * deriv1(x, x1))
        iterations -= 1
    return x, x1

def sqrd_error(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y)**2
    return sum / (2.0 * len(training_set))

def deriv(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y)
    return sum / len(training_set)

def deriv1(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y) * inst.X
    return sum / len(training_set)

if __name__ == "__main__":
    print(grad_desc(2, 2))
Reduce training_rate so that the objective decreases at each iteration.
See Figure 6. in this paper: http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
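A compact way to see this on the same five points is sketched below. It is a restructured version for illustration, not the question's code: the names errors, cost and the rate parameter are assumptions, and 0.0001 is just one value small enough for this data, not a recommended setting.

```python
import math

# The training_set values from the question, split into inputs and targets.
XS = [60, 61, 62, 63, 65]
YS = [3.1, 3.6, 3.8, 4.0, 4.1]

def errors(b, w):
    """Prediction errors of the line y = b + w*x on the training data."""
    return [b + w * x - y for x, y in zip(XS, YS)]

def cost(b, w):
    """Halved mean squared error (products instead of **2, so overflow gives inf)."""
    return sum(e * e for e in errors(b, w)) / (2 * len(XS))

def grad_desc(b, w, rate, iterations=5000):
    m = len(XS)
    for _ in range(iterations):
        errs = errors(b, w)
        grad_b = sum(errs) / m
        grad_w = sum(e * x for e, x in zip(errs, XS)) / m
        b, w = b - rate * grad_b, w - rate * grad_w  # simultaneous update
    return b, w

diverged = grad_desc(2.0, 2.0, rate=0.1)       # steps overshoot, values blow up to nan
converging = grad_desc(2.0, 2.0, rate=0.0001)  # cost decreases at every step

if __name__ == "__main__":
    print(diverged)
    print(converging, cost(*converging))
```

With x values around 60, each step at rate 0.1 overshoots the minimum by a large factor, so the parameters grow until they overflow to infinity and then turn into nan. Shifting or scaling the inputs would allow a larger rate, but that goes beyond what the answer above suggests.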

Gradient in continuous regression using a neural network

I'm trying to implement a regression NN that has 3 layers (1 input, 1 hidden and 1 output layer with a continuous result). As a basis I took a classification NN from coursera.org class, but changed the cost function and gradient calculation so as to fit a regression problem (and not a classification one):
My nnCostFunction now is:
function [J grad] = nnCostFunctionLinear(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
m = size(X, 1);
a1 = X;
a1 = [ones(m, 1) a1];
a2 = a1 * Theta1';
a2 = [ones(m, 1) a2];
a3 = a2 * Theta2';
Y = y;
J = 1/(2*m)*sum(sum((a3 - Y).^2))
th1 = Theta1;
th1(:,1) = 0; %set bias = 0 in reg. formula
th2 = Theta2;
th2(:,1) = 0;
t1 = th1.^2;
t2 = th2.^2;
th = sum(sum(t1)) + sum(sum(t2));
th = lambda * th / (2*m);
J = J + th; %regularization
del_3 = a3 - Y;
t1 = del_3'*a2;
Theta2_grad = 2*(t1)/m + lambda*th2/m;
t1 = del_3 * Theta2;
del_2 = t1 .* a2;
del_2 = del_2(:,2:end);
t1 = del_2'*a1;
Theta1_grad = 2*(t1)/m + lambda*th1/m;
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end
Then I use this function with the fmincg algorithm, but fmincg stops within the first few iterations. I think my gradient is wrong, but I can't find the error.
Can anybody help?
If I understand correctly, your first block of code (shown below) -
m = size(X, 1);
a1 = X;
a1 = [ones(m, 1) a1];
a2 = a1 * Theta1';
a2 = [ones(m, 1) a2];
a3 = a2 * Theta2';
Y = y;
is to get the output a(3) at the output layer.
Ng's slides on NNs use a different configuration to calculate a(3) than your code does: in the middle/output layer, you are not applying the activation function g, e.g. a sigmoid function.
In terms of the cost function J without regularization terms, Ng's slides have the below formula:
I don't understand why you can compute it using:
J = 1/(2*m)*sum(sum((a3 - Y).^2))
because you are not including the log function at all.
Mikhaill, I've been playing with a NN for continuous regression as well, and had similar issues at some point. The best thing to do here would be to test the gradient computation against a numerical calculation before running the model. If that's not correct, fmincg won't be able to train the model. (By the way, I'd discourage you from using the numerical gradient for training, as it takes much longer to compute.)
Taking into account that you took this idea from Ng's Coursera class, I'll implement a possible solution for you to try, using the same notation for Octave.
% Cost function without regularization.
J = 1/(2*m)*sum((a3-Y).^2);
% In case it's needed, regularization term is added (i.e. for Training).
if (reg==true);
J=J+lambda/2/m*(sum(sum(Theta1(:,2:end).^2))+sum(sum(Theta2(:,2:end).^2)));
endif;
% Derivatives are computed for layer 2 and 3.
d3=(a3-Y);
d2=d3*Theta2(:,2:end);
% Theta grad is computed without regularization.
Theta1_grad=(d2'*a1)./m;
Theta2_grad=(d3'*a2)./m;
% Regularization is added to grad computation.
Theta1_grad(:,2:end)=Theta1_grad(:,2:end)+(lambda/m).*Theta1(:,2:end);
Theta2_grad(:,2:end)=Theta2_grad(:,2:end)+(lambda/m).*Theta2(:,2:end);
% Unroll gradients.
grad = [Theta1_grad(:) ; Theta2_grad(:)];
Note that, since you have taken out all the sigmoid activation, the derivative calculation is quite simple and results in a simplification of the original code.
Next steps:
1. Check this code to understand if it makes sense to your problem.
2. Use gradient checking to test gradient calculation.
3. Finally, use fmincg and check you get different results.
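Step 2 can be sketched independently of the network. The example below is in Python rather than Octave and uses a toy one-feature linear model, so every name in it (numerical_gradient, cost, analytic_gradient, the sample data) is an assumption for illustration. It compares a central-difference estimate with the analytic gradient of the 1/(2m) squared-error cost. For that cost the analytic gradient carries a factor of 1/m, with no extra factor of 2.

```python
def numerical_gradient(cost_fn, params, eps=1e-4):
    """Central-difference estimate of d(cost)/d(params[i]) for each i."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((cost_fn(up) - cost_fn(down)) / (2 * eps))
    return grad

# Toy data for the model y = theta[0] + theta[1] * x (made up for this check).
XS = [0.0, 1.0, 2.0]
YS = [1.0, 3.0, 5.0]

def cost(theta):
    m = len(XS)
    return sum((theta[0] + theta[1] * x - y) ** 2 for x, y in zip(XS, YS)) / (2 * m)

def analytic_gradient(theta):
    m = len(XS)
    errs = [theta[0] + theta[1] * x - y for x, y in zip(XS, YS)]
    return [sum(errs) / m, sum(e * x for e, x in zip(errs, XS)) / m]

if __name__ == "__main__":
    theta = [0.5, -0.2]
    print(numerical_gradient(cost, theta))
    print(analytic_gradient(theta))
```

If the two lists disagree beyond rounding error, the analytic gradient (or the cost it is supposed to differentiate) is wrong, and an optimizer such as fmincg is likely to stall early.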
Try to include sigmoid function to compute second layer (hidden layer) values and avoid sigmoid in calculating the target (output) value.
function [J grad] = nnCostFunction1(nnParams, ...
inputLayerSize, ...
hiddenLayerSize, ...
numLabels, ...
X, y, lambda)
Theta1 = reshape(nnParams(1:hiddenLayerSize * (inputLayerSize + 1)), ...
hiddenLayerSize, (inputLayerSize + 1));
Theta2 = reshape(nnParams((1 + (hiddenLayerSize * (inputLayerSize + 1))):end), ...
numLabels, (hiddenLayerSize + 1));
Theta1Grad = zeros(size(Theta1));
Theta2Grad = zeros(size(Theta2));
m = size(X,1);
a1 = [ones(m, 1) X]';
z2 = Theta1 * a1;
a2 = sigmoid(z2);
a2 = [ones(1, m); a2];
z3 = Theta2 * a2;
a3 = z3;
Y = y';
r1 = lambda / (2 * m) * sum(sum(Theta1(:, 2:end) .* Theta1(:, 2:end)));
r2 = lambda / (2 * m) * sum(sum(Theta2(:, 2:end) .* Theta2(:, 2:end)));
J = 1 / ( 2 * m ) * (a3 - Y) * (a3 - Y)' + r1 + r2;
delta3 = a3 - Y;
delta2 = (Theta2' * delta3) .* sigmoidGradient([ones(1, m); z2]);
delta2 = delta2(2:end, :);
Theta2Grad = 1 / m * (delta3 * a2');
Theta2Grad(:, 2:end) = Theta2Grad(:, 2:end) + lambda / m * Theta2(:, 2:end);
Theta1Grad = 1 / m * (delta2 * a1');
Theta1Grad(:, 2:end) = Theta1Grad(:, 2:end) + lambda / m * Theta1(:, 2:end);
grad = [Theta1Grad(:) ; Theta2Grad(:)];
end
Normalize the inputs before passing them to nnCostFunction.
In accordance with the Week 5 Lecture Notes guideline for a Linear System NN, you should make the following changes to the initial code:
1. Remove num_labels or make it 1 (in reshape() as well).
2. No need to convert y into a logical matrix.
3. For a2, replace the sigmoid() function with tanh().
4. In the d2 calculation, replace sigmoidGradient(z2) with (1-tanh(z2).^2).
5. Remove sigmoid from the output layer (a3 = z3).
6. Replace the cost function in the unregularized portion with the linear one: J = (1/(2*m))*sum((a3-y).^2)
7. Create predictLinear(): use the predict() function as a basis, replace sigmoid with tanh() for the first layer hypothesis, remove the second sigmoid for the second layer hypothesis, remove the line with the max() function, and use the output of the hidden layer hypothesis as the prediction result.
8. Verify your nnCostFunctionLinear() on the test case from the lecture notes.
