How does Roblox's math.noise() deal with negative inputs? - lua

While messing around with noise outside of Roblox, I realized Perlin/Simplex Noise does not like negative inputs. Remembering Roblox has a noise function, I tried there, and found out negative numbers do work nicely for Roblox's math.noise(). Does anybody know how they made this work, or how to get negative numbers to work for Perlin/Simplex noise in general?
The simplex noise implementation I am using (copied from here, but changed to use a pure-Lua bitwise AND function):
local function bit_and(a, b) --bitwise and operation
local p, c = 1, 0
while a > 0 and b > 0 do
local ra, rb = a%2, b%2
if (ra + rb) > 1 then
c = c + p
end
a = (a - ra) / 2
b = (b - rb) / 2
p = p * 2
end
return c
end
-- 2D simplex noise
local grad3 = {
{1,1,0},{-1,1,0},{1,-1,0},{-1,-1,0},
{1,0,1},{-1,0,1},{1,0,-1},{-1,0,-1},
{0,1,1},{0,-1,1},{0,1,-1},{0,-1,-1}
}
local p = {151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180}
local perm = {}
for i=0,511 do
perm[i+1] = p[bit_and(i, 255) + 1]
end
local function dot(g, ...)
local v = {...}
local sum = 0
for i=1,#v do
sum = sum + v[i] * g[i]
end
return sum
end
local noise = {}
function noise.produce(xin, yin)
local n0, n1, n2 -- Noise contributions from the three corners
-- Skew the input space to determine which simplex cell we're in
local F2 = 0.5*(math.sqrt(3.0)-1.0)
local s = (xin+yin)*F2; -- Hairy factor for 2D
local i = math.floor(xin+s)
local j = math.floor(yin+s)
local G2 = (3.0-math.sqrt(3.0))/6.0
local t = (i+j)*G2
local X0 = i-t -- Unskew the cell origin back to (x,y) space
local Y0 = j-t
local x0 = xin-X0 -- The x,y distances from the cell origin
local y0 = yin-Y0
-- For the 2D case, the simplex shape is an equilateral triangle.
-- Determine which simplex we are in.
local i1, j1 -- Offsets for second (middle) corner of simplex in (i,j) coords
if x0 > y0 then
i1 = 1
j1 = 0 -- lower triangle, XY order: (0,0)->(1,0)->(1,1)
else
i1 = 0
j1 = 1
end-- upper triangle, YX order: (0,0)->(0,1)->(1,1)
-- A step of (1,0) in (i,j) means a step of (1-c,-c) in (x,y), and
-- a step of (0,1) in (i,j) means a step of (-c,1-c) in (x,y), where
-- c = (3-sqrt(3))/6
local x1 = x0 - i1 + G2 -- Offsets for middle corner in (x,y) unskewed coords
local y1 = y0 - j1 + G2
local x2 = x0 - 1 + 2 * G2 -- Offsets for last corner in (x,y) unskewed coords
local y2 = y0 - 1 + 2 * G2
-- Work out the hashed gradient indices of the three simplex corners
local ii = bit_and(i, 255)
local jj = bit_and(j, 255)
local gi0 = perm[ii + perm[jj+1]+1] % 12
local gi1 = perm[ii + i1 + perm[jj + j1+1]+1] % 12
local gi2 = perm[ii + 1 + perm[jj + 1+1]+1] % 12
-- Calculate the contribution from the three corners
local t0 = 0.5 - x0 * x0 - y0 * y0
if t0 < 0 then
n0 = 0.0
else
t0 = t0 * t0
n0 = t0 * t0 * dot(grad3[gi0+1], x0, y0) -- (x,y) of grad3 used for 2D gradient
end
local t1 = 0.5 - x1 * x1 - y1 * y1
if t1 < 0 then
n1 = 0.0
else
t1 = t1 * t1
n1 = t1 * t1 * dot(grad3[gi1+1], x1, y1)
end
local t2 = 0.5 - x2 * x2 - y2 * y2
if t2 < 0 then
n2 = 0.0
else
t2 = t2 * t2
n2 = t2 * t2 * dot(grad3[gi2+1], x2, y2)
end
-- Add contributions from each corner to get the final noise value.
-- The result is scaled to return values in the interval [-1,1].
return 70.0 * (n0 + n1 + n2)
end
return noise

The Lua programming language variant that Roblox uses, Luau, has been open source since November 2021. You can find it here. The math library lives in the file lmathlib.cpp, which contains the math.noise function along with the internal functions used to compute it: perlin (the main function), grad, lerp, and fade. It's quite involved and I can't fully explain it myself, but I have converted it into Lua here.
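As a rough illustration of how those pieces (fade, lerp, grad, and the permutation lookup) fit together, here is a minimal 1D sketch in Lua. This is not Roblox's actual implementation: the gradient selection is simplified and the output is not rescaled. The part that matters for negative inputs is that math.floor works fine for negative numbers, and that the integer lattice coordinate is wrapped into 0..255 before indexing the permutation table. Ken Perlin's reference code does that wrap with a bitwise AND against 255; in pure Lua it is easiest to express as % 256, since Lua's % always returns a non-negative result for a positive divisor (e.g. -5 % 256 == 251). Note also that the pure-Lua bit_and above only loops while both arguments are positive, so it returns 0 when given a negative first argument, which is one reason a straight port can misbehave with negative coordinates.
-- reuses the 256-entry table p from the question above
-- note: this perm is indexed from 0, unlike the question's perm
local perm = {}
for i = 0, 511 do
    perm[i] = p[(i % 256) + 1]
end

local function fade(t)
    return t * t * t * (t * (t * 6 - 15) + 10) -- 6t^5 - 15t^4 + 10t^3
end

local function lerp(t, a, b)
    return a + t * (b - a)
end

local function grad(hash, x)
    -- simplified 1D gradient: slope +1 or -1 depending on the low bit of the hash
    if hash % 2 == 0 then return x else return -x end
end

local function perlin1d(x)
    local cell = math.floor(x)   -- floor also works for negatives: math.floor(-0.25) == -1
    local xi = cell % 256        -- always lands in 0..255, even when cell is negative
    local xf = x - cell          -- fractional part, always in [0, 1)
    local u = fade(xf)
    return lerp(u, grad(perm[xi], xf), grad(perm[xi + 1], xf - 1))
end

print(perlin1d(-3.7), perlin1d(3.7))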

Related

How can I set transition.to to continue in the same direction with the same velocity once it has arrived at the set point?

Hi, I am new to coding. I am using Lua and Solar2D, trying to transition object1 toward another object2's coordinates, and I want object1 to continue along the same path with the same velocity if it doesn't hit object2.
I can easily transition to the object, but I don't know how to go beyond that.
transition.to( object1, { x=object2.x, y=object2.y, time=3000, })
I feel I will have to add an onComplete listener, but I'm not sure what it should do.
Any help would be greatly appreciated.
You have to calculate the equation of the line (y = m * x + b) that you are traveling along.
Formulas:
m = (y2 - y1) / (x2 - x1)
b = y1 - m * x1
So in your case:
m = (object2.y - object1.y) / (object2.x - object1.x)
b = object1.y - m * object1.x
Now you have the equation of the path (line) to keep if object1 doesn't hit object2.
When the transition ends, you want to check if the object2 is still there (object1 hit it) or not (object1 keeps moving), so you need to include an onComplete listener to check for that.
As for the speed, you have to decide whether you want a constant speed, in which case you have to calculate the time for each transition, or whether you always use 3 seconds no matter how close or far object2 is from object1. I guess you probably want the first option, so the movement isn't very slow when the objects are close and too fast when they are far apart. In that case you have to set a constant speed s of your choosing.
Formulas:
Speed = Distance / Time
Time = Distance / Speed
Distance between 2 points:
d = squareRoot( (x2 - x1)^2 + (y2 - y1)^2 )
In summary, it would be something like this:
s = 10 --Constant speed
m = (object2.y - object1.y) / (object2.x - object1.x)
b = object1.y - m * object1.x
direction = 1 --assume it's traveling to the right
if(object2.x < object1.x)then
direction = -1 --it's traveling to the left
end
local function checkCollision( obj )
if(obj.x == object2.x and obj.y == object2.y)then
-- Object1 hit Object2
else
-- Object2 is not here anymore, continue until it goes offscreen
-- following the line equation
x3 = -10 -- if it's traveling to the left
if(direction == 1)then
--it's traveling to the right
x3 = display.contentWidth + 10
end
y3 = m * x3 + b
d2 = math.sqrt( (x3 - obj.x)^2 + (y3 - obj.y)^2 )
t2 = d2 / s
transition.to( obj, {x=x3, y=y3, time=t2} )
end
end
d1 = math.sqrt( (object2.x - object1.x)^2 + (object2.y - object1.y)^2 )
t1 = d1 / s
transition.to( object1, { x=object2.x, y=object2.y, time=t1, onComplete=checkCollision} )
You should try different values for the speed s until you get the desired movement.

PyTorch, slicing tensor causes RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I wrote an RNN with an LSTM cell in PyCharm. The peculiarity of this network is that the output of the RNN is fed into an integration operation, computed with Runge-Kutta.
The integration takes some input and propagates it one step ahead in time. In order to do so I need to slice the feature tensor X along the batch dimension and pass the slice to the Runge-Kutta routine.
import torch

class MyLSTM(torch.nn.Module):
    def __init__(self, ni, no, sampling_interval, nh=10, nlayers=1):
        super(MyLSTM, self).__init__()
        self.device = torch.device("cpu")
        self.dtype = torch.float
        self.ni = ni
        self.no = no
        self.nh = nh
        self.nlayers = nlayers
        self.lstms = torch.nn.ModuleList(
            [torch.nn.LSTMCell(self.ni, self.nh)] +
            [torch.nn.LSTMCell(self.nh, self.nh) for i in range(nlayers - 1)])
        self.out = torch.nn.Linear(self.nh, self.no)
        self.do = torch.nn.Dropout(p=0.2)
        self.actfn = torch.nn.Sigmoid()
        self.sampling_interval = sampling_interval
        self.scaler_states = None

    # Options
    # description of the whole block
    def forward(self, x, h0, train=False, integrate_ode=True):
        x0 = x.clone().requires_grad_(True)
        hs = x  # initiate hidden state
        if h0 is None:
            h = torch.zeros(hs.shape[0], self.nh, device=self.device)
            c = torch.zeros(hs.shape[0], self.nh, device=self.device)
        else:
            (h, c) = h0
        # LSTM cells
        for i in range(self.nlayers):
            h, c = self.lstms[i](hs, (h, c))
            if train:
                hs = self.do(h)
            else:
                hs = h
        # Output layer
        # y = self.actfn(self.out(hs))
        y = self.out(hs)
        if integrate_ode:
            p = y
            y = self.integrate(x0, p)
        return y, (h, c)

    def integrate(self, x0, p):
        # RK4 steps per interval
        M = 4
        DT = self.sampling_interval / M
        X = x0
        # X = self.scaler_features.inverse_transform(x0)
        for b in range(X.shape[0]):
            xx = X[b, :]
            for j in range(M):
                k1 = self.ode(xx, p[b, :])
                k2 = self.ode(xx + DT / 2 * k1, p[b, :])
                k3 = self.ode(xx + DT / 2 * k2, p[b, :])
                k4 = self.ode(xx + DT * k3, p[b, :])
                xx = xx + DT / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            X_all[b, :] = xx  # X_all is assumed to be allocated elsewhere (not shown in the post)
        return X_all

    def ode(self, x0, y):
        # Here I have a dynamic model
I get this error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor []], which is output 0 of SelectBackward, is at version 64; expected version 63 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
The problem is in the operations xx = X[b, :] and p[b, :]. I know that because if I choose a batch dimension of 1, I can replace those two expressions with xx = X and p, and it works. How can I split the tensor without losing the gradient?
I had the same question, and after a lot of searching I added the .detach() function after "h" and "c" in the RNN cell.

Gradient Descent cost function explosion

I am writing this code for linear regression and trying gradient descent to minimize the RSS. The cost function seems to explode to infinity within 12 iterations, and I know this is not supposed to happen. Maybe I have used the wrong gradient function for the RSS (it can be seen in the function grad())?
import numpy as np

NumberObservations = 100
minVal = 1
maxVal = 20
X = np.random.uniform(minVal, maxVal, (NumberObservations, 1))
e = np.random.normal(0, 1, (NumberObservations, 1))
Y = 10 + 5*X + e
B = np.array([[0], [0]])
sum_y = sum(Y)
sum_x = sum(X)
sum_xy = sum(np.multiply(X, Y))
sum_x2 = sum(X*X)
alpha = 0.00001
iterations = 15

def cost_fun(X, Y, B):
    b0 = B[0]
    b1 = B[1]
    s = (Y - (b0 + (b1*X)))**2
    rss = sum(s)
    return rss

def grad(X, Y, B):
    print("B = " + str(B))
    b0 = B[0]
    b1 = B[1]
    g0 = -2*(Y - b0 - (b1*X))
    g1 = -2*((X*Y) - (b0*X) - (b1*X**2))
    grad = np.concatenate((g0, g1), axis=1)
    return grad

def gradient_descent(X, Y, B, alpha, iterations):
    cost_history = [0] * iterations
    m = len(Y)
    x0 = np.array(np.ones(m))
    x0 = x0.reshape((100, 1))
    X1 = np.concatenate((x0, X), axis=1)
    for iteration in range(iterations):
        h = np.dot(X1, B)
        h = h.reshape((100, 1))
        loss = h - Y
        g = grad(X, Y, B)
        gradient = (np.dot(g.T, loss) / m)
        B = B - alpha * gradient
        cost = cost_fun(X, Y, B)
        cost_history[iteration] = cost
        print("Iteration %d | Cost: %f" % (iteration, cost))
        print("-----------------------------------------------------------------------")
    return B, cost_history

newB, cost_history = gradient_descent(X, Y, B, alpha, iterations)
# New Values of B
print(newB)
Please help.
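For reference, differentiating the RSS that cost_fun defines, RSS(b_0, b_1) = \sum_i (y_i - b_0 - b_1 x_i)^2, gives the following gradient (this is the textbook result, stated here only as a baseline to compare grad() against):
\frac{\partial\,\mathrm{RSS}}{\partial b_0} = -2\sum_{i}\big(y_i - b_0 - b_1 x_i\big), \qquad \frac{\partial\,\mathrm{RSS}}{\partial b_1} = -2\sum_{i} x_i\big(y_i - b_0 - b_1 x_i\big)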

how to write private function in lua class

I'm trying to write a Lua "class" with a private function like this:
local myTable = {}
function myTable.func()
private()
end
local function private()
print(":O")
end
return myTable
If I require myTable and then run myTable.func(), I'll get an error saying private is not defined.
I've found 2 ways to solve this:
move the function private in front of func
Forward-declare local private before func and change the definition of private from local function private to function private.
But I'm a little confused about why they work and which is the common way.
which is the common way
Both work and both are advisable. The second approach is needed in situations where you have two functions that call each other and both need to be local but not inside a table.
You could always use the second style and thus keep consistency, though it might be less readable, as you would need to go to a different place in the code to see whether your function is local.
However, for readability and shorter code I would use the first approach, so I don't need a separate "declaration" of my local functions.
I'm a little confused about why they work
The original code does not work because of local variable scope.
From the Lua reference manual:
Lua is a lexically scoped language. The scope of a local variable
begins at the first statement after its declaration and lasts until
the last non-void statement of the innermost block that includes the
declaration.
So in your original code, the local variable private only exists from the line where it is declared onward, and the code fails because func tries to use it in code that comes before that line.
Both approaches work because they move the start of the local variable's scope above the code where you use the variable.
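For example, here is a minimal sketch of the second approach (a forward declaration) applied to the code from the question:
local myTable = {}

local private            -- forward declaration: the name is in scope from here on

function myTable.func()
    private()            -- resolves to the local declared above
end

function private()       -- assigns the function to the existing local, not to a global
    print(":O")
end

return myTable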
You may want to read about local variables and the scoping in the reference manual:
http://www.lua.org/manual/5.2/manual.html#3.3.7
http://www.lua.org/manual/5.2/manual.html#3.5
First of all: In your code snippet it's not clear to me where the "class" is, as myTable is just an object. If you put this in a module and require it, you just get an object.
You could do something like this:
local function MyTable() -- constructor
local function private()
print(":O")
end
return {
func = function()
private()
end
}
end
local m = MyTable()
m.func()
This may not be the usual way of doing OOP in Lua, but here private obviously is .. well .. private.
I created this sample code:
local object = {}
do -- Creates Scope
-- Private Scope
local fire_rate = 5
-- Public Scope
function object:load()
end
function object:update()
end
function object:draw()
end
function object:setFireRate(rate)
fire_rate = rate
end
function object:getFireRate()
return fire_rate
end
end
return object
Hope this helps.
You'll basically need something like this:
local function Bezier(x1,y1,x2,y2,x3,y3)
--Private
local inf = 1/0
local x1 = x1
local y1 = y1
local x2 = x2
local y2 = y2
local x3 = x3
local y3 = y3
local maxY = y1 > y2 and (y1 > y3 and y1 or y3) or y2 > y3 and y2 or y3
local minY = y1 < y2 and (y1 < y3 and y1 or y3) or y2 < y3 and y2 or y3
local maxX = x1 > x2 and (x1 > x3 and x1 or x3) or x2 > x3 and x2 or x3
local minX = x1 < x2 and (x1 < x3 and x1 or x3) or x2 < x3 and x2 or x3
local xc = (x3 - 2*x2 + x1)
local xb = 2*(x2 - x1)
local yc = (y3 - 2*y2 + y1)
local yb = 2*(y2 - y1)
--Public
local self = {}
--Render
self.render = function(resolution)
local path = {}
local num = 1
for index=0, 1, 1/resolution do
path[num] = {(1-index)^2*x1+2*(1-index)*index*x2+index^2*x3, (1-index)^2*y1+2*(1-index)*index*y2+index^2*y3}
num = num + 1
end
return path
end
--Point
function self.point(index)
return {(1-index)^2*x1+2*(1-index)*index*x2+index^2*x3, (1-index)^2*y1+2*(1-index)*index*y2+index^2*y3}
end
--Get x of particular y
function self.getX(y)
if y > maxY or y < minY then
return
end
local a = y1 - y
if a == 0 then
return
end
local b = yb
local c = yc
local discriminant = (b^2 - 4*a*c )
if discriminant < 0 then
return
else
local aByTwo = 2*a
if discriminant == 0 then
local index1 = -b/aByTwo
if 0 < index1 and index1 < 1 then
print("=====",y,1,maxY,minY)
return (1-index1)^2*x1+2*(1-index1)*index1*x2+index1^2*x3
end
else
local theSQRT = math.sqrt(discriminant)
local index1, index2 = (-b -theSQRT)/aByTwo, (-b +theSQRT)/aByTwo
if 0 < index1 and index1 < 1 then
if 0 < index2 and index2 < 1 then
print("=====",y,2,maxY,minY)
return (1-index1)^2*x1+2*(1-index1)*index1*x2+index1^2*x3, (1-index2)^2*x1+2*(1-index2)*index2*x2+index2^2*x3
else
print("=====",y,1,maxY,minY)
return (1-index1)^2*x1+2*(1-index1)*index1*x2+index1^2*x3
end
elseif 0 < index2 and index2 < 1 then
print("=====",y,1,maxY,minY)
return (1-index2)^2*x1+2*(1-index2)*index2*x2+index2^2*x3
end
end
end
end
--Get y of particular x
function self.getY(x)
if x > maxX or x < minX then
return
end
if maxX == minX and x == minX then
return minY, maxY
end
local index1, index2, buffer1, buffer2
local a = (x1 - x)
if a == 0 then
return
end
local b = xb
local c = xc
local discriminant = b^2 - 4*a*c
if discriminant < 0 then
return
else
local aByTwo = 2*a
local theSQRT = math.sqrt(discriminant)
if discriminant == 0 then
local index1 = -b/aByTwo
return (1-index1)^2*y1+2*(1-index1)*index1*y2+index1^2*y3
else
local index1, index2 = (-b - theSQRT)/aByTwo, (-b + theSQRT)/aByTwo
return (1-index1)^2*y1+2*(1-index1)*index1*y2+index1^2*y3, (1-index2)^2*y1+2*(1-index2)*index2*y2+index2^2*y3
end
end
end
--Scanline render
function self.scanRender()
local path = {}
local counter = 1
local fX, sX
local a = (y3 - 2*y2 + y1)
local b = 2*(y2 - y1)
for i=minY, maxY do
fX, sX = self.getX(i,a,b)
if fX then
path[counter] = fX
path[counter+1] = i
counter = counter + 2
if sX then
path[counter] = sX
path[counter+1] = i
counter = counter + 2
end
end
end
return path
end
--More efficient
--Self
return self
end
By calling Bezier, you get a Bezier object. Its public interface is the self table that is returned, and the functions in it can still access all the private locals.

Gradient in continuous regression using a neural network

I'm trying to implement a regression NN that has 3 layers (1 input, 1 hidden, and 1 output layer with a continuous result). As a basis I took a classification NN from the coursera.org class, but changed the cost function and gradient calculation to fit a regression problem (rather than a classification one):
My nnCostFunction now is:
function [J grad] = nnCostFunctionLinear(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
m = size(X, 1);
a1 = X;
a1 = [ones(m, 1) a1];
a2 = a1 * Theta1';
a2 = [ones(m, 1) a2];
a3 = a2 * Theta2';
Y = y;
J = 1/(2*m)*sum(sum((a3 - Y).^2))
th1 = Theta1;
th1(:,1) = 0; %set bias = 0 in reg. formula
th2 = Theta2;
th2(:,1) = 0;
t1 = th1.^2;
t2 = th2.^2;
th = sum(sum(t1)) + sum(sum(t2));
th = lambda * th / (2*m);
J = J + th; %regularization
del_3 = a3 - Y;
t1 = del_3'*a2;
Theta2_grad = 2*(t1)/m + lambda*th2/m;
t1 = del_3 * Theta2;
del_2 = t1 .* a2;
del_2 = del_2(:,2:end);
t1 = del_2'*a1;
Theta1_grad = 2*(t1)/m + lambda*th1/m;
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end
Then I use this function in the fmincg algorithm, but after the first few iterations fmincg ends its work. I think my gradient is wrong, but I can't find the error.
Can anybody help?
If I understand correctly, your first block of code (shown below) -
m = size(X, 1);
a1 = X;
a1 = [ones(m, 1) a1];
a2 = a1 * Theta1';
a2 = [ones(m, 1) a2];
a3 = a2 * Theta2';
Y = y;
is to get the output a(3) at the output layer.
Ng's slides about NNs compute a(3) by applying the activation g at every layer, i.e. a(2) = g(Theta1 * a(1)) and a(3) = g(Theta2 * a(2)) (the slide image is not reproduced here). That is different from what your code does:
in the middle/output layer, you are not applying the activation function g, e.g. a sigmoid function.
In terms of the cost function J without regularization terms, Ng's slides use the classification (cross-entropy) cost, which includes log terms (again, the slide image is not reproduced here).
I don't understand why you can compute it using:
J = 1/(2*m)*sum(sum((a3 - Y).^2))
because you are not including the log function at all.
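For reference, the unregularized neural-network classification cost from the course (the one the log remark above refers to) is
J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[ y_k^{(i)}\log\big(h_\Theta(x^{(i)})\big)_k + \big(1-y_k^{(i)}\big)\log\big(1-(h_\Theta(x^{(i)}))_k\big)\right]
whereas the squared-error expression above is the usual regression cost; the two are not interchangeable, which is the point being made here.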
Mikhaill, I've been playing with an NN for continuous regression as well, and had similar issues at some point. The best thing to do here would be to test the gradient computation against a numerical calculation before running the model. If it's not correct, fmincg won't be able to train the model. (By the way, I discourage you from using the numerical gradient for training, as it is much slower.)
Since you took this idea from Ng's Coursera class, I'll implement a possible solution for you to try, using the same notation, in Octave.
% Cost function without regularization.
J = 1/2/m^2*sum((a3-Y).^2);
% In case it's needed, regularization term is added (i.e. for Training).
if (reg==true);
J=J+lambda/2/m*(sum(sum(Theta1(:,2:end).^2))+sum(sum(Theta2(:,2:end).^2)));
endif;
% Derivatives are computed for layer 2 and 3.
d3=(a3.-Y);
d2=d3*Theta2(:,2:end);
% Theta grad is computed without regularization.
Theta1_grad=(d2'*a1)./m;
Theta2_grad=(d3'*a2)./m;
% Regularization is added to grad computation.
Theta1_grad(:,2:end)=Theta1_grad(:,2:end)+(lambda/m).*Theta1(:,2:end);
Theta2_grad(:,2:end)=Theta2_grad(:,2:end)+(lambda/m).*Theta2(:,2:end);
% Unroll gradients.
grad = [Theta1_grad(:) ; Theta2_grad(:)];
Note that, since you have taken out all the sigmoid activation, the derivative calculation is quite simple and results in a simplification of the original code.
Next steps:
1. Check this code to understand if it makes sense to your problem.
2. Use gradient checking to test the gradient calculation (the check is written out just after this list).
3. Finally, use fmincg and check you get different results.
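The gradient check mentioned in step 2 is the usual two-sided finite-difference test from the course (a general recipe, not something specific to the code above): for each parameter \theta_i, compare the analytic gradient with
\frac{\partial J}{\partial \theta_i} \approx \frac{J(\theta + \varepsilon e_i) - J(\theta - \varepsilon e_i)}{2\varepsilon}, \qquad \varepsilon \approx 10^{-4},
where e_i is the i-th unit vector; the relative difference should be tiny (on the order of 1e-9) if the analytic gradient is correct.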
Try including the sigmoid function when computing the second (hidden) layer values, and avoid the sigmoid when calculating the target (output) value.
function [J grad] = nnCostFunction1(nnParams, ...
inputLayerSize, ...
hiddenLayerSize, ...
numLabels, ...
X, y, lambda)
Theta1 = reshape(nnParams(1:hiddenLayerSize * (inputLayerSize + 1)), ...
hiddenLayerSize, (inputLayerSize + 1));
Theta2 = reshape(nnParams((1 + (hiddenLayerSize * (inputLayerSize + 1))):end), ...
numLabels, (hiddenLayerSize + 1));
Theta1Grad = zeros(size(Theta1));
Theta2Grad = zeros(size(Theta2));
m = size(X,1);
a1 = [ones(m, 1) X]';
z2 = Theta1 * a1;
a2 = sigmoid(z2);
a2 = [ones(1, m); a2];
z3 = Theta2 * a2;
a3 = z3;
Y = y';
r1 = lambda / (2 * m) * sum(sum(Theta1(:, 2:end) .* Theta1(:, 2:end)));
r2 = lambda / (2 * m) * sum(sum(Theta2(:, 2:end) .* Theta2(:, 2:end)));
J = 1 / ( 2 * m ) * (a3 - Y) * (a3 - Y)' + r1 + r2;
delta3 = a3 - Y;
delta2 = (Theta2' * delta3) .* sigmoidGradient([ones(1, m); z2]);
delta2 = delta2(2:end, :);
Theta2Grad = 1 / m * (delta3 * a2');
Theta2Grad(:, 2:end) = Theta2Grad(:, 2:end) + lambda / m * Theta2(:, 2:end);
Theta1Grad = 1 / m * (delta2 * a1');
Theta1Grad(:, 2:end) = Theta1Grad(:, 2:end) + lambda / m * Theta1(:, 2:end);
grad = [Theta1Grad(:) ; Theta2Grad(:)];
end
Normalize the inputs before passing them to nnCostFunction.
In accordance with the Week 5 Lecture Notes guideline for a Linear System NN, you should make the following changes to the initial code:
Remove num_labels or make it 1 (in reshape() as well)
No need to convert y into a logical matrix
For a2, replace the sigmoid() function with tanh()
In the d2 calculation, replace sigmoidGradient(z2) with (1-tanh(z2).^2)
Remove the sigmoid from the output layer (a3 = z3)
Replace the cost function in the unregularized portion with a linear one: J = (1/(2*m))*sum((a3-y).^2)
Create predictLinear(): use the predict() function as a basis, replace sigmoid with tanh() for the first-layer hypothesis, remove the second sigmoid for the second-layer hypothesis, remove the line with the max() function, and use the output of the hidden-layer hypothesis as the prediction result
Verify your nnCostFunctionLinear() on the test case from the lecture notes
