My gradient descent is not giving the exact value

I have written a gradient descent algorithm in Octave, but it is not giving me the exact answer; the result differs from the expected one in the first one or two digits.
Here is my code:
function theta = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    s = 0;
    temp = theta;
    for iter = 1:num_iters
        for j = 1:size(theta, 1)
            for i = 1:m
                h = theta' * X(i, :)';
                s = s + (h - y(i)) * X(i, j);
            end
            s = s/m;
            temp(j) = temp(j) - alpha * s;
        end
        theta = temp;
    end
end
For:
theta = gradientDescent([1 5; 1 2; 1 4; 1 5], [1 6 4 2]', [0 0]', 0.01, 1000);
My gradient descent gives this:
4.93708
-0.50549
But it is expected to give this:
5.2148
-0.5733

Minor fixes:
Your variable s (the delta) is initialised incorrectly; it must be reset to zero for each parameter j, not just once before all iterations.
The temp variable (the new theta) should be refreshed from the current theta at the start of each iteration, so that all parameters are updated simultaneously.
As a result, the delta is being calculated incorrectly.
Try the changes below.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        temp = theta; % start from the current theta so all parameters update simultaneously
        for j = 1:size(theta, 1)
            s = 0; % reset the delta for every parameter
            for i = 1:m
                s = s + (X(i, :) * theta - y(i)) * X(i, j);
            end
            s = s / m;
            temp(j) = temp(j) - alpha * s;
        end
        theta = temp;
        J_history(iter) = computeCost(X, y, theta);
    end
end
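With these fixes, the call from the question should converge to the expected values (this assumes a correct computeCost, such as the vectorized one given in the next answer, is on the path):
theta = gradientDescent([1 5; 1 2; 1 4; 1 5], [1 6 4 2]', [0 0]', 0.01, 1000)
% theta is approximately [5.2148; -0.5733], the expected output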

Related

Why is my cost function giving the wrong answer?

I have written code for the cost function, and it is giving an incorrect answer.
I have read the code many times, but I cannot find the mistake.
Here is my code:
function J = computeCost(X, y, theta)
    m = length(y); % number of training examples
    s = 0;
    h = 0;
    sq = 0;
    J = 0;
    for i = 1:m
        h = theta' * X(i, :)';
        sq = (h - y(i))^2;
        s = s + sq;
    end
    J = (1/2*m) * s;
end
Example:
computeCost([1 2; 1 3; 1 4; 1 5], [7;6;5;4], [0.1;0.2])
ans = 11.9450
The answer should be 11.9450, but my code gives me this:
ans = 191.12
I have checked the matrix multiplication, and the code is calculating it correctly.
It seems you misunderstood the operator evaluation order. In fact,
1/2*m ~= 1/(2*m)
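For example, with m = 4 (the number of training examples here), the two expressions differ by a factor of m^2 = 16, which is exactly the observed ratio 191.12 / 11.9450:
m = 4;
1/2*m   % evaluates as (1/2)*m, ans = 2
1/(2*m) % ans = 0.125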
With this in mind, it seems you're computing an average. Instead of reinventing the wheel, it is usually a good idea to use the built-in functions to do the job, which results in a much clearer (and less error-prone) implementation:
function J = computeCost(X, y, theta)
    h = X * theta;
    sq = (h - y).^2;
    J = 1/2 * mean(sq);
end
computeCost([1,2;1,3;1,4;1,5], [7;6;5;4], [0.1;0.2])
% ans = 11.9450

Linear regression implementation in Octave

I recently tried implementing linear regression in Octave and couldn't get past the online judge. Here's the code:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        for i = 1:m
            temp1 = theta(1) - (alpha/m) * (X(i,:)*theta - y(i,:));
            temp2 = theta(2) - (alpha/m) * (X(i,:)*theta - y(i,:)) * X(i,2);
            theta = [temp1; temp2];
        endfor
        J_history(iter) = computeCost(X, y, theta);
    end
end
I am aware of the vectorized implementation, but I just wanted to try the iterative method. Any help would be appreciated.
You don't need the inner for loop. Instead you can use the sum function.
In code:
for iter = 1:num_iters
    j = 1:m;
    temp1 = sum((theta(1) + theta(2) .* X(j,2)) - y(j));
    temp2 = sum(((theta(1) + theta(2) .* X(j,2)) - y(j)) .* X(j,2));
    theta(1) = theta(1) - (alpha/m) * temp1;
    theta(2) = theta(2) - (alpha/m) * temp2;
    J_history(iter) = computeCost(X, y, theta);
end
It'd be a good exercise to implement the vectorized solution too, and then compare the two to see in practice how much the vectorization improves efficiency. A sketch is given below.
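A minimal vectorized sketch, assuming X already carries the bias column of ones and that m, alpha, num_iters, and computeCost are defined as above:
for iter = 1:num_iters
    theta = theta - (alpha/m) * X' * (X*theta - y); % simultaneous update of all parameters
    J_history(iter) = computeCost(X, y, theta);
end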

Convert cv::Vec4f line to cv::Vec2f

I have a pair of Cartesian coordinates that represent a line in an image. I would like to convert this line to polar form and draw it over the image.
For example:
cv::Vec4f line {10,20,60,70};
float x1 = line[0];
float y1 = line[1];
float x2 = line[2];
float y2 = line[3];
I want this line to be represented in cv::Vec2f form (rho, theta), taking care of rho and theta for all possible slopes.
Given are the image dimensions, w and h:
w = image.cols
h = image.rows
How can I achieve this?
N.B: We can also assume that the line can be an extended one running across the image.
for (size_t i = 0; i < lines.size(); i++)
{
    int x1 = lines[i][0];
    int y1 = lines[i][1];
    int x2 = lines[i][2];
    int y2 = lines[i][3];
    float d = sqrt(((y1-y2)*(y1-y2)) + ((x2-x1)*(x2-x1)));
    float rho = (y1*x2 - y2*x1)/d;
    float theta = atan2(x2 - x1, y1 - y2);
    if (rho < 0) {
        theta *= -1;
        rho *= -1;
    }
    linv2f.push_back(cv::Vec2f(rho, theta));
}
The above approach doesn't give me correct results: when I plot the converted lines, they don't overlap their original Vec4f form.
I use this to convert Vec2f back to Vec4f for testing:
cv::Vec4f cvtVec2fLine(const cv::Vec2f& data, const cv::Mat& img)
{
    float const rho = data[0];
    float const theta = data[1];
    cv::Point pt1, pt2;
    if (theta < CV_PI/4. || theta > 3. * CV_PI/4.) {
        pt1 = cv::Point(rho / std::cos(theta), 0);
        pt2 = cv::Point((rho - img.rows * std::sin(theta)) / std::cos(theta), img.rows);
    } else {
        pt1 = cv::Point(0, rho / std::sin(theta));
        pt2 = cv::Point(img.cols, (rho - img.cols * std::cos(theta)) / std::sin(theta));
    }
    cv::Vec4f l;
    l[0] = pt1.x;
    l[1] = pt1.y;
    l[2] = pt2.x;
    l[3] = pt2.y;
    return l;
}
The rho-theta line equation has the form
x * Cos(Theta) + y * Sin(Theta) - Rho = 0
We want to convert the line equation 'by two points' into rho-theta form (page 92 in the PDF here). If we have
x * A + y * B - C = 0
and need the coefficients in trigonometric form, we can divide the whole equation by the magnitude of the (A, B) coefficient vector:
D = Length(A,B) = Math.Hypot(A,B)
x * A/D + y * B/D - C/D = 0
Note that (A/D)^2 + (B/D)^2 = 1 (the basic trigonometric identity), so we can treat A/D and B/D as the cosine and sine of some angle Theta.
Your line equation is
(y-y1) * (x2-x1) - (x-x1) * (y2-y1) = 0
or
x * (y1-y2) + y * (x2-x1) - (y1 * x2 - y2 * x1) = 0
let
D = Sqrt((y1-y2)^2 + (x2-x1)^2)
so
Theta = ArcTan2(x2-x1, y1-y2)
Rho = (y1 * x2 - y2 * x1) / D
If Rho is negative, change the sign of Rho and shift Theta by Pi.
Example:
x1 = 1, y1 = 0, x2 = 0, y2 = 1
Theta = ArcTan2(-1, -1) = -3*Pi/4
D = Sqrt(2)
Rho = -Sqrt(2)/2, which is negative, so:
Rho = Sqrt(2)/2
Theta = -3*Pi/4 + Pi = Pi/4
Back substitution: find the points of intersection with the axes.
0 * Sqrt(2)/2 + y * Sqrt(2)/2 - Sqrt(2)/2 = 0 gives x = 0, y = 1
x * Sqrt(2)/2 + 0 * Sqrt(2)/2 - Sqrt(2)/2 = 0 gives x = 1, y = 0
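The same formulas as a small Octave sketch (the helper name is hypothetical), handy for checking conversions numerically before porting them back to the C++ loop above:
function [rho, theta] = lineToRhoTheta(x1, y1, x2, y2)
    % line through two points: x*(y1-y2) + y*(x2-x1) - (y1*x2 - y2*x1) = 0
    d = hypot(y1 - y2, x2 - x1); % magnitude of the (A, B) coefficient vector
    rho = (y1*x2 - y2*x1) / d;
    theta = atan2(x2 - x1, y1 - y2);
    if rho < 0 % normalise: keep rho >= 0 and shift theta by pi
        rho = -rho;
        theta = theta + pi;
    end
end
% [rho, theta] = lineToRhoTheta(1, 0, 0, 1) gives rho = sqrt(2)/2, theta = pi/4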

Gradient descent values not correct

I'm attempting to implement gradient descent using the code from:
Gradient Descent implementation in octave
I've amended the code to the following:
X = [1; 1; 1;]
y = [1; 0; 1;]
m = length(y);
X = [ones(m, 1), data(:,1)];
theta = zeros(2, 1);
iterations = 2000;
alpha = 0.001;
for iter = 1:iterations
    theta = theta - ((1/m) * ((X * theta) - y)' * X)' * alpha;
end
theta
This gives the following output:
X =
1
1
1
y =
1
0
1
theta =
0.32725
0.32725
theta is a 1x2 matrix, but shouldn't it be 1x3, as the output (y) is 3x1?
So I should be able to multiply theta by a training example to make a prediction, but I cannot multiply x by theta, as x is 1x3 and theta is 1x2?
Update:
%X = [1 1; 1 1; 1 1;]
%y = [1 1; 0 1; 1 1;]
X = [1 1 1; 1 1 1; 0 0 0;]
y = [1 1 1; 0 0 0; 1 1 1;]
m = length(y);
X = [ones(m, 1), X];
theta = zeros(4, 1);
theta
iterations = 2000;
alpha = 0.001;
for iter = 1:iterations
    theta = theta - ((1/m) * ((X * theta) - y)' * X)' * alpha;
end
% to make a prediction
m = size(X, 1); % number of training examples
p = zeros(m, 1);
htheta = sigmoid(X * theta);
p = htheta >= 0.5;
You are misinterpreting the dimensions here. Your data consists of 3 points, each having a single dimension. Furthermore, you add a dummy dimension of 1s:
X = [ones(m, 1), data(:,1)];
thus
octave:1> data = [1;2;3]
data =
1
2
3
octave:2> [ones(m, 1), data(:,1)]
ans =
1 1
1 2
1 3
and theta is your parametrization, which you should be able to apply through (this is math notation, not code)
h(x) = x1 * theta1 + theta0
so your theta should have two dimensions: one weight for your dummy dimension (the so-called bias) and one for the actual X dimension. If your X has K dimensions, theta will have K+1. Thus, after adding the dummy dimension, the matrices have the following shapes:
X is 3x2
y is 3x1
theta is 2x1
so
X * theta is 3x1
the same as y
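A quick shape check in Octave, reusing the illustrative data = [1; 2; 3] from above:
data = [1; 2; 3];
m = length(data);
X = [ones(m, 1), data(:, 1)]; % 3x2
theta = zeros(2, 1);          % 2x1
size(X * theta)               % ans = 3 1, i.e. the same shape as y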

Fast bilinear interpolation on old iOS devices

I've got the following code to do a bilinear interpolation from a matrix of 2D vectors. Each cell has the x and y values of the vector, and the function receives the k and l indices indicating the bottom-left nearest position in the matrix.
// p[1] returns the interpolated values
// fieldLinePointsVerts is the raw data array of fieldNumHorizontalPoints x fieldNumVerticalPoints
// only fieldNumHorizontalPoints matters to determine the index to access the raw data
// k and l are the horizontal and vertical indices of the point just below p[0] in the raw data
void interpolate( vertex2d* p, vertex2d* fieldLinePointsVerts, int fieldNumHorizontalPoints, int k, int l ) {
    int index = (l * fieldNumHorizontalPoints + k) * 2;
    vertex2d p11;
    p11.x = fieldLinePointsVerts[index].x;
    p11.y = fieldLinePointsVerts[index].y;
    vertex2d q11;
    q11.x = fieldLinePointsVerts[index+1].x;
    q11.y = fieldLinePointsVerts[index+1].y;
    index = (l * fieldNumHorizontalPoints + k + 1) * 2;
    vertex2d q21;
    q21.x = fieldLinePointsVerts[index+1].x;
    q21.y = fieldLinePointsVerts[index+1].y;
    index = ( (l + 1) * fieldNumHorizontalPoints + k) * 2;
    vertex2d q12;
    q12.x = fieldLinePointsVerts[index+1].x;
    q12.y = fieldLinePointsVerts[index+1].y;
    index = ( (l + 1) * fieldNumHorizontalPoints + k + 1 ) * 2;
    vertex2d p22;
    p22.x = fieldLinePointsVerts[index].x;
    p22.y = fieldLinePointsVerts[index].y;
    vertex2d q22;
    q22.x = fieldLinePointsVerts[index+1].x;
    q22.y = fieldLinePointsVerts[index+1].y;
    float fx = 1.0 / (p22.x - p11.x);
    float fx1 = (p22.x - p[0].x) * fx;
    float fx2 = (p[0].x - p11.x) * fx;
    vertex2d r1;
    r1.x = fx1 * q11.x + fx2 * q21.x;
    r1.y = fx1 * q11.y + fx2 * q21.y;
    vertex2d r2;
    r2.x = fx1 * q12.x + fx2 * q22.x;
    r2.y = fx1 * q12.y + fx2 * q22.y;
    float fy = 1.0 / (p22.y - p11.y);
    float fy1 = (p22.y - p[0].y) * fy;
    float fy2 = (p[0].y - p11.y) * fy;
    p[1].x = fy1 * r1.x + fy2 * r2.x;
    p[1].y = fy1 * r1.y + fy2 * r2.y;
}
Currently this code needs to run every single frame on old iOS devices, e.g. devices with ARMv6 processors.
I've taken the numeric sub-indices from Wikipedia's equations: http://en.wikipedia.org/wiki/Bilinear_interpolation
I'd appreciate any comments on optimization for performance, even plain asm code.
This code should not be causing your slowdown if it's only run once per frame; however, if it's run multiple times per frame, it easily could be.
I'd run your app with a profiler to see where the true performance problem lies.
There is some room for optimization here: a) certain index calculations could be factored out and reused in subsequent calculations; b) you could dereference your fieldLinePointsVerts array to a pointer once and reuse it, instead of indexing it twice per index.
But in general, those things won't help a great deal unless this function is being called many, many times per frame, in which case every little bit will help.
