Greedy algorithm for a given unsorted input with O(n log n) time complexity

Multi-Constrained Knapsack Problem
I have the following example and I'm just trying to understand: what's the difference between a greedy algorithm with O(n log n) and a greedy algorithm with O(n^2)? I really don't know how to start, please help! Should I sort it, or do something different?
(The profit-to-weight ratios are not in decreasing or increasing order; they are totally random.)
p = (p_1, ..., p_n) = (24, 17, 95, 103, 41, 39, 22, 1)
w = (w_1, ..., w_n) = (20, 15, 39, 41, 27, 23, 18, 2)

Yup, the O(n log n) algorithm is to sort by (profit / weight) in decreasing order, and then grab as many objects as possible up to the weight limit. Of course the sorting algorithm has to be O(n log n).
The naive (O(n^2)) algorithm would be to repeatedly search the list for the item with the highest (profit / weight) ratio and grab that. Note that this is, in effect, the same as doing a selection sort.
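For illustration, here is a minimal Octave sketch of the O(n log n) version; the capacity C is hypothetical, since the question does not give one:
p = [24 17 95 103 41 39 22 1];
w = [20 15 39 41 27 23 18 2];
C = 100;                                    % hypothetical capacity
[ratios, order] = sort(p ./ w, 'descend');  % O(n log n) sort by profit/weight ratio
total_w = 0; total_p = 0; chosen = [];
for i = order
  if total_w + w(i) <= C                    % grab the item if it still fits
    total_w += w(i);
    total_p += p(i);
    chosen(end+1) = i;
  endif
endfor
printf('items %s, profit %d, weight %d\n', mat2str(chosen), total_p, total_w);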


Linear Regression - Implementing Feature Scaling

I was trying to implement Linear Regression in Octave 5.1.0 on a data set relating the GRE score to the probability of Admission.
The data set looks like this:
337 0.92
324 0.76
316 0.72
322 0.8
. . .
My main Program.m file looks like this:
% read the data
data = load('Admission_Predict.txt');
% initiate variables
x = data(:,1);
y = data(:,2);
m = length(y);
theta = zeros(2,1);
alpha = 0.01;
iters = 1500;
J_hist = zeros(iters,1);
% plot data
subplot(1,2,1);
plot(x,y,'rx','MarkerSize', 10);
title('training data');
% compute cost function
x = [ones(m,1), (data(:,1) ./ 300)]; % feature scaling
J = computeCost(x,y,theta);
% run gradient descent
[theta, J_hist] = gradientDescent(x,y,theta,alpha,iters);
hold on;
subplot(1,2,1);
plot((x(:,2) .* 300), (x*theta),'-');
xlabel('GRE score');
ylabel('Probability');
hold off;
subplot (1,2,2);
plot(1:iters, J_hist, '-b');
xlabel('no: of iteration');
ylabel('Cost function');
computeCost.m looks like this:
function J = computeCost(x,y,theta)
m = length(y);
h = x * theta;
J = (1/(2*m))*sum((h-y) .^ 2);
endfunction
and gradientDescent.m looks like this:
function [theta, J_hist] = gradientDescent(x,y,theta,alpha,iters)
m = length(y);
J_hist = zeros(iters,1);
for i=1:iters
diff = (x*theta - y);
theta = theta - (alpha * (1/(m))) * (x' * diff);
J_hist(i) = computeCost(x,y,theta);
endfor
endfunction
The plots then look like this, which, as you can see, doesn't feel right even though my cost function seems to be minimized.
Can someone please tell me if this is right? If not, what am I doing wrong?
The easiest way to check whether your implementation is correct is to compare it with a validated implementation of linear regression. I suggest using an alternative implementation approach like the one suggested here, and then comparing your results. If the fits match, then this is the best linear fit to your data; if they don't match, then there may be something wrong in your implementation.
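For example, here is a minimal sketch of that comparison, assuming the same data file and the theta produced by the script above, with Octave's built-in polyfit used as the reference implementation:
data = load('Admission_Predict.txt');
coeffs = polyfit(data(:,1), data(:,2), 1);   % reference least-squares line: [slope, intercept]
% The gradient-descent model predicts y = theta(1) + theta(2) * (x / 300),
% so on the original scale its slope is theta(2) / 300 and its intercept is theta(1).
printf('slope:     polyfit %.6f  vs  gradient descent %.6f\n', coeffs(1), theta(2) / 300);
printf('intercept: polyfit %.6f  vs  gradient descent %.6f\n', coeffs(2), theta(1));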

Simulate and present normal distribution

My task is to compare different methods of simulating the normal distribution. For example, I use the following code to generate two vectors of 1000 values each (Box-Muller method):
k=1;
mu=0;
N = 1000;
alpha = rand(1, N);
beta = rand(1, N);
val1 = sqrt(-2 * log(alpha)) .* sin(2 * pi * beta);
val2 = sqrt(-2 * log(alpha)) .* cos(2 * pi * beta);
hist([val1,val2]);
hold on;
%Now I want to make normal distr pdf over hist to see difference
[f,x] = ecdf(mu+sigma*[val1,val2]);
p = normpdf(x,mu, sigma);
plot(x,p*N,'r');
However, it looks very ugly: I can't distinguish val1 from val2, and my pdf doesn't fit the histogram well. I think I'm doing something wrong with the pdf, but I don't know what. I found different code on the Internet:
r = rand(1000,2); % 2 cols of uniform rand
%Box-Muller
n = sqrt(-2*log(r(:,1)))*[1,1].*[cos(2*pi*r(:,2)), sin(2*pi*r(:,2))];
hist(n) % plot two histograms
It looks better, but I don't know how to plot the normal distribution pdf over it; the method with ecdf causes an error.
I'm rather new to Matlab and sometimes make simple mistakes (like with vector dimensions), but for now I can barely see them.
Can someone help me with the above, or propose another way to simulate normal random variables and compare against it (with the B-M method or another, just not so complicated)?
I think your plots have different scales; the corrected code would look like this:
clear all;
sigma=1; mu=0; N = 1000;
alpha = rand(1, N); beta = rand(1, N);
val1 = sqrt(-2 * log(alpha)) .* sin(2 * pi * beta);
val2 = sqrt(-2 * log(alpha)) .* cos(2 * pi * beta);
vals = [val1,val2];
Nbins = 50; [h,hx] = hist(vals,Nbins);
bar(hx,h*0.5/(hx(2)-hx(1)))
hold on;
%Now I want to make normal distr pdf over hist to see difference
[f,x] = ecdf(mu+sigma*vals);
p = normpdf(x,mu, sigma);
plot(x,p*N,'r');
As mentioned in the comments, quantitative comparison of the distributions requires performing statistical tests (e.g. goodness of fit http://en.wikipedia.org/wiki/Goodness_of_fit)
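For instance, here is a minimal sketch of one such test, assuming MATLAB's Statistics Toolbox is available; kstest compares the sample against a standard normal by default, which is what Box-Muller should produce for mu = 0, sigma = 1:
vals = [val1, val2];
[h, pval] = kstest(vals);                 % h = 0 means normality is not rejected
fprintf('KS test: h = %d, p-value = %.3f\n', h, pval);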

Batch gradient descent for polynomial regression

I am trying to move on from simple linear single-variable gradient descent to something more advanced: the best polynomial fit for a set of points. I created a simple Octave test script which allows me to visually set the points in a 2D space, then start the gradient descent algorithm and see how it gradually approaches the best fit.
Unfortunately, it doesn't work as well as it did with simple single-variable linear regression: the results I get (when I get them) are inconsistent with the polynomial I expect!
Here is the code:
dim=5;
h = figure();
axis([-dim dim -dim dim]);
hold on
index = 1;
data = zeros(1,2);
while(1)
[x,y,b] = ginput(1);
if( length(b) == 0 )
break;
endif
plot(x, y, "b+");
data(index, :) = [x y];
index++;
endwhile
y = data(:, 2);
m = length(y);
X = data(:, 1);
X = [ones(m, 1), data(:,1), data(:,1).^2, data(:,1).^3 ];
theta = zeros(4, 1);
iterations = 100;
alpha = 0.001;
J = zeros(1,iterations);
for iter = 1:iterations
theta -= ( (1/m) * ((X * theta) - y)' * X)' * alpha;
plot(-dim:0.01:dim, theta(1) + (-dim:0.01:dim).*theta(2) + (-dim:0.01:dim).^2.*theta(3) + (-dim:0.01:dim).^3.*theta(4), "g-");
J(iter) = sum( (1/m) * ((X * theta) - y)' * X);
end
plot(-dim:0.01:dim, theta(1) + (-dim:0.01:dim).*theta(2) + (-dim:0.01:dim).^2.*theta(3) + (-dim:0.01:dim).^3.*theta(4), "r-");
figure()
plot(1:iter, J);
I continuously get wrong results, even though it would seem that J is minimized correctly. I checked the plotting function against the normal equation (which works correctly, of course), and although I believe the error lies somewhere in the theta update, I cannot figure out what it is.
I ran your code and it seems to be just fine. The reason you do not get the results you want is that linear regression, or polynomial regression in your case, suffers from local minima when you try to minimize the objective function: the algorithm gets trapped in a local minimum during execution. I ran your code with a different step size (alpha) and saw that with a smaller step it fits the data better, but it still gets trapped in a local minimum.
Choosing a random initialization point for theta, I get trapped in a different local minimum every time. If you are lucky, you will find better initial points for theta and fit the data better. I think there are algorithms that find good initial points.
Below I attach the results for random initial points and the results with Matlab's polyfit.
In the above plot, read "Linear Regression" as "Polynomial Regression"; the label is a typo.
If you look at the plot more closely, you will see that by chance (using rand()) I chose some initial points that led to a better fit than the other initial points; I am marking that with a pointer.
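For reference, the polyfit comparison mentioned above only takes a couple of lines (a minimal sketch, reusing the data matrix and dim from the script in the question):
coeffs = polyfit(data(:,1), data(:,2), 3);   % reference least-squares cubic fit
xs = -dim:0.01:dim;
plot(xs, polyval(coeffs, xs), 'k--');        % overlay the reference fit for comparison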

How do I iterate over a list and pass the function result to the next iteration

I'm new to F# and in an effort to learn, thought it would be fun to implement a clustering algorithm.
I have an input list of lists that I need to iterate over. For each of these input vectors I need to apply a function that updates the weights and returns a list of lists (weight matrix). I can do that part via the newMatrix function. The problem is, I need to use the updated weight matrix in the next iteration, and I'm lost as to how to do this. Here's the important parts, some functions left out for brevity.
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[.2; .6; .5; .9]; [.8; .4; .7; .3]]
let newMatrix xi matrix =
    List.map2 (fun w wi ->
        if wi = (yiIndex xi) then (newWeights xi)
        else w) matrix [0..matrix.Length-1]
printfn "%A" (newMatrix inputList.Head weights)
[[0.2; 0.6; 0.5; 0.9]; [0.92; 0.76; 0.28; 0.32]]
So my question is, how do I iterate over inputList calculating newMatrix for each inputVector using the previous newMatrix result?
Edit: added pseudo-algorithm:
for input vector 1
given weight matrix calculate new weight matrix
return weight matrix prime
for input vector 2
given weight matrix prime calculate new weight matrix
and so on...
...
Aside: I'm implementing a Kohonen SOM algorithm from this book.
If you just started learning F#, then it may be useful to try implementing this explicitly using recursion first. As Ankur points out, this particular recursive pattern is captured by List.fold, but it is quite useful to understand how List.fold actually works. So, the explicit version would look like this:
// Takes vectors to be processed and an initial list of weights.
// The result is an adapted list of weights.
let rec processVectors weights vectors =
    match vectors with
    | [] ->
        // If 'vectors' is empty list, we're done and we just return current weights
        weights
    | head::tail ->
        // We got a vector 'head' and remaining vectors 'tail'
        // Adapt the weights using the current vector...
        let weights2 = newweights weights head
        // and then adapt weights using the remaining vectors (recursively)
        processVectors weights2 tail
This is essentially what List.fold does, but it may be easier to understand it if you see the code written like this (the List.fold function hides the recursive processing, so the lambda function used as an argument is just the function that calculates new weights).
As an aside, I don't quite understand your newMatrix function. Can you give more details about it? Generally, when working with lists you don't need to use indexing, and it seems that you're doing something that requires accessing elements at a specific index. There may be a better way to write that.
I guess you are looking for List.fold.
Something like:
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[0.2; 0.6; 0.5; 0.9]; [0.8; 0.4; 0.7; 0.3]]
let newWeights w values = w //Fake method which returns old weight as it is
inputList |> List.fold (newWeights) weights
NOTE: The newWeights function in this case is taking weights and input vector and returns new weights
Or maybe List.scan, in case you also need the intermediate calculated weights:
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[0.2; 0.6; 0.5; 0.9]; [0.8; 0.4; 0.7; 0.3]]
let newWeights w values = w
inputList |> List.scan (newWeights) weights

Laplacian of gaussian filter use

This is the formula for LoG filtering (source: ed.ac.uk):
LoG(x, y) = -(1 / (pi * sigma^4)) * (1 - (x^2 + y^2) / (2 * sigma^2)) * exp(-(x^2 + y^2) / (2 * sigma^2))
Also, in applications with LoG filtering, I see that the function is called with only one parameter: sigma (σ).
I want to try LoG filtering using that formula (my previous attempt was a Gaussian filter followed by a Laplacian filter with some filter-window size).
But looking at that formula I can't understand how the size of the filter is connected to it. Does it mean that the filter size is fixed?
Can you explain how to use it?
As you've probably figured out by now from the other answers and links, the LoG filter detects edges and lines in the image. What is still missing is an explanation of what σ is.
σ is the scale of the filter. Is a one-pixel-wide line a line or noise? Is a line 6 pixels wide a line or an object with two distinct parallel edges? Is a gradient that changes from black to white across 6 or 8 pixels an edge or just a gradient? It's something you have to decide, and the value of σ reflects your decision: the larger σ is, the wider the lines, the smoother the edges, and the more noise is ignored.
Do not confuse the scale of the filter (σ) with the size of the discrete approximation (usually called the stencil). In Paul's link σ = 1.4 and the stencil size is 9. While it is usually reasonable to use a stencil size of 4σ to 6σ, these two quantities are quite independent. A larger stencil provides a better approximation of the filter, but in most cases you don't need a very good approximation.
This was something that confused me too, and it wasn't until I had to do the same as you for a uni project that I understood what you were supposed to do with the formula!
You can use this formula to generate a discrete LoG filter. If you write a bit of code to implement that formula, you can then generate a filter for use in image convolution. To generate, say, a 5x5 template, simply call the code with x and y ranging from -2 to +2.
This will generate the values to use in a LoG template. If you graph the values this produces, you should see the "Mexican hat" shape typical of this filter:
[Image of the Mexican-hat-shaped LoG surface; source: ed.ac.uk]
You can fine-tune the template by changing how wide it is (the size) and the sigma value (how broad the peak is). The wider and broader the template, the less the result is affected by noise, because it operates over a wider area.
Once you have the filter, you can apply it to the image by convolving the template with the image. If you've not done this before, check out these few tutorials.
Essentially, at each pixel location, you "place" your convolution template, centred at that pixel. You then multiply the surrounding pixel values by the corresponding "pixel" in the template and add up the result. This is then the new pixel value at that location (typically you also have to normalise (scale) the output to bring it back into the correct value range).
The code below gives a rough idea of how you might implement this. Please forgive any mistakes / typos etc. as it hasn't been tested.
I hope this helps.
private float LoG(float x, float y, float sigma)
{
    // LoG(x, y) = -1/(pi*sigma^4) * (1 - (x^2 + y^2)/(2*sigma^2)) * exp(-(x^2 + y^2)/(2*sigma^2))
    // (same sign convention as the formula and the Matlab plot below)
    float r2 = x * x + y * y;
    return (float) ((-1.0 / (Math.PI * Math.pow(sigma, 4)))
            * (1.0 - r2 / (2.0 * sigma * sigma))
            * Math.exp(-r2 / (2.0 * sigma * sigma)));
}
private float[][] GenerateTemplate(int templateSize, float sigma)
{
    // Make sure it's an odd number for convenience
    if(templateSize % 2 != 1)
    {
        throw new IllegalArgumentException("templateSize must be odd");
    }
    // Create the data array
    float[][] template = new float[templateSize][templateSize];
    // Work out the "min and max" values. LoG is centred around (0, 0),
    // so for a size 5 template (say) we want to feed the values
    // -2, -1, 0, +1, +2 into the formula.
    int min = -(templateSize / 2);
    int max = templateSize / 2;
    // We also need counters to index into the data array...
    int xCount = 0;
    for(int x = min; x <= max; ++x)
    {
        int yCount = 0;   // reset the inner index for each x
        for(int y = min; y <= max; ++y)
        {
            // Get the LoG value for this (x,y) pair
            template[xCount][yCount] = LoG(x, y, sigma);
            ++yCount;
        }
        ++xCount;
    }
    return template;
}
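For completeness, here is a minimal, self-contained Octave sketch of the same pipeline: build a small LoG template directly from the formula above and apply it to an image by convolution (the image file name is only a placeholder):
sigma = 1.4;
half  = 4;                                    % 9x9 stencil, roughly 6*sigma wide
[x, y] = meshgrid(-half:half, -half:half);
r2 = x.^2 + y.^2;
template = (-1 / (pi * sigma^4)) .* (1 - r2 / (2 * sigma^2)) .* exp(-r2 / (2 * sigma^2));
img = double(imread('example.png'));          % placeholder: any greyscale image
response = conv2(img, template, 'same');      % convolve the template with the image
imagesc(response); colormap(gray); axis image;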
Just for visualization purposes, here is a simple Matlab 3D colored plot of the Laplacian of Gaussian (Mexican Hat) wavelet. You can change the sigma(σ) parameter and see its effect on the shape of the graph:
sigmaSq = 0.5 % Square of σ parameter
[x y] = meshgrid(linspace(-3,3), linspace(-3,3));
z = (-1/(pi*(sigmaSq^2))) .* (1-((x.^2+y.^2)/(2*sigmaSq))) .*exp(-(x.^2+y.^2)/(2*sigmaSq));
surf(x,y,z)
You could also compare the effects of the sigma parameter on the Mexican hat by doing the following:
t = -5:0.01:5;
sigma = 0.5;
mexhat05 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 1;
mexhat1 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 2;
mexhat2 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
plot(t, mexhat05, 'r', ...
t, mexhat1, 'b', ...
t, mexhat2, 'g');
Or simply use the Wavelet toolbox provided by Matlab as follows:
lb = -5; ub = 5; n = 1000;
[psi,x] = mexihat(lb,ub,n);
plot(x,psi), title('Mexican hat wavelet')
I found this useful when implementing this for edge detection in computer vision. Although not the exact answer, hope this helps.
It appears to be a continuous circular filter whose radius is sqrt(2) * sigma. If you want to implement this for image processing you'll need to approximate it.
There's an example for sigma = 1.4 here: http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm
