How do I iterate over a list and pass the function result to the next iteration - F#

I'm new to F# and in an effort to learn, thought it would be fun to implement a clustering algorithm.
I have an input list of lists that I need to iterate over. For each of these input vectors I need to apply a function that updates the weights and returns a list of lists (the weight matrix). I can do that part via the newMatrix function. The problem is that I need to use the updated weight matrix in the next iteration, and I'm lost as to how to do this. Here are the important parts, with some functions left out for brevity.
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[.2; .6; .5; .9]; [.8; .4; .7; .3]]
let newMatrix xi matrix =
    List.map2 (fun w wi ->
        if wi = (yiIndex xi) then (newWeights xi)
        else w) matrix [0..matrix.Length-1]
printfn "%A" (newMatrix inputList.Head weights)
[[0.2; 0.6; 0.5; 0.9]; [0.92; 0.76; 0.28; 0.32]]
So my question is, how do I iterate over inputList calculating newMatrix for each inputVector using the previous newMatrix result?
Edit: added pseudo algorithm:
for input vector 1
    given weight matrix calculate new weight matrix
    return weight matrix prime
for input vector 2
    given weight matrix prime calculate new weight matrix
and so on...
Aside: I'm implementing a Kohonen SOM algorithm from this book.

If you just started learning F#, then it may be useful to try implementing this explicitly using recursion first. As Ankur points out, this particular recursive pattern is captured by List.fold, but it is quite useful to understand how List.fold actually works. So, the explicit version would look like this:
// Takes vectors to be processed and an initial list of weights.
// The result is an adapted list of weights.
let rec processVectors weights vectors =
    match vectors with
    | [] ->
        // If 'vectors' is an empty list, we're done and we just return the current weights
        weights
    | head::tail ->
        // We got a vector 'head' and the remaining vectors 'tail'.
        // Adapt the weights using the current vector...
        let weights2 = newWeights weights head
        // ...and then adapt the weights using the remaining vectors (recursively)
        processVectors weights2 tail
This is essentially what List.fold does, but it may be easier to understand it if you see the code written like this (the List.fold function hides the recursive processing, so the lambda function used as an argument is just the function that calculates new weights).
Aside, I don't quite understand your newMatrix function. Can you give more details about that? Generally, when working with lists you don't need to use indexing and it seems that you're doing something that requires accessing elements at a specific index. There may be a better way to write that....

I guess you are looking for List.fold.
Something like:
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[0.2; 0.6; 0.5; 0.9]; [0.8; 0.4; 0.7; 0.3]]
let newWeights w values = w //Fake method which returns old weight as it is
inputList |> List.fold (newWeights) weights
NOTE: The newWeights function in this case is taking weights and input vector and returns new weights
Or maybe List.scan, in case you also need the intermediate calculated weights:
let inputList = [[1; 1; 0; 0]; [0; 0; 0; 1]; [1; 0; 0; 0]; [0; 0; 1; 1;]]
let weights = [[0.2; 0.6; 0.5; 0.9]; [0.8; 0.4; 0.7; 0.3]]
let newWeights w values = w
inputList |> List.scan (newWeights) weights
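For comparison, here is the same fold/scan idea sketched in Python, where functools.reduce plays the role of List.fold and itertools.accumulate (with initial=, Python 3.8+) plays the role of List.scan. The update rule in new_weights and the learning rate ALPHA are made up for illustration (move the row closest to the input toward it); they are an assumption, not the asker's actual newWeights.

```python
from functools import reduce
from itertools import accumulate

ALPHA = 0.6  # hypothetical learning rate, chosen only for this sketch


def new_weights(weights, x):
    """Return a NEW matrix in which the row closest to x is moved toward x."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(row, x)) for row in weights]
    winner = dists.index(min(dists))
    return [
        [wi + ALPHA * (xi - wi) for wi, xi in zip(row, x)] if i == winner else row
        for i, row in enumerate(weights)
    ]


input_list = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]
weights = [[0.2, 0.6, 0.5, 0.9], [0.8, 0.4, 0.7, 0.3]]

# Fold: each call receives the matrix returned by the previous call.
final = reduce(new_weights, input_list, weights)

# Scan: same computation, but every intermediate matrix is kept.
history = list(accumulate(input_list, new_weights, initial=weights))

print(final)
print(len(history))  # initial matrix plus one matrix per input vector
```

The accumulator (the weight matrix) is the first argument of the folded function and the current input vector is the second, which is the same argument order List.fold expects.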


Vectorize loop dependent on its previous state

We'll take this code:
n = 30;
x = ones(1, n);
for i=1:n
The equation is just an example.
Is it possible to vectorize it?

Vectorization issue

Say you have two column vectors v and w, each with 7 elements (i.e., they have dimensions 7x1). Consider the following code:
z = 0;
for i = 1:7
  z = z + v(i) * w(i)
end
A) z = sum (v .* w);
B) z = w' * v;
C) z = v * w;
D) z = w * v;
According to the solutions, answers (A) AND (B) are the right answers, can someone please help me understand why?
Why is z = v * w', which is similar to answer (B) with only the order of the operands changed, wrong? Since we want a 1x1 result, wouldn't we need a product with dimensions 1x7 * 7x1 = 1x1? So why is z = v' * w wrong? It gives the same dimensions as answer (B).
z = v'*w is correct and is equal to w'*v.
They both produce a 1x1 matrix, which Octave treats as a plain number.
See this:
octave:5> v = rand(7, 1);
octave:6> w = rand(7, 1);
octave:7> v'*w
ans = 1.3110
octave:8> w'*v
ans = 1.3110
octave:9> sum(v.*w)
ans = 1.3110
Answers A and B both perform a dot product of the two vectors, which yields the same result as the code provided. Answer A first performs the element-wise product (.*) of the two column vectors, then sums those intermediate values. Answer B performs the same mathematical operation but does so via a dot product (i.e., matrix multiplication).
Answer C is incorrect because it would be performing a matrix multiplication on misaligned matrices (7x1 and 7x1). The same is true for D.
z = v * w', which was not one of the options, is incorrect because it would yield a 7x7 matrix (instead of the 1x1 scalar value desired). The point is that order matters when performing matrix multiplication. (1xN)X(Nx1) -> (1x1), whereas (Nx1)X(1xN) -> (NxN).
z = v' * w is actually a correct solution but was simply not provided as one of the options.
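To make the shape argument concrete, here is a small pure-Python sketch with a hand-written matmul and transpose (both are helper functions written for this example, not library calls). The vectors are arbitrary sample values.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices; the inner dimensions must agree."""
    if len(a[0]) != len(b):
        raise ValueError(
            f"nonconformant: {len(a)}x{len(a[0])} times {len(b)}x{len(b[0])}"
        )
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


v = [[float(i)] for i in range(1, 8)]      # 7x1 column vector
w = [[float(2 * i)] for i in range(1, 8)]  # 7x1 column vector

# The original loop.
z_loop = 0.0
for i in range(7):
    z_loop += v[i][0] * w[i][0]

z_a = sum(vi[0] * wi[0] for vi, wi in zip(v, w))  # option A: sum(v .* w)
z_b = matmul(transpose(w), v)                     # option B: w' * v, a 1x1 matrix
outer = matmul(v, transpose(w))                   # v * w', a 7x7 matrix

print(z_loop, z_a, z_b)
print(len(outer), len(outer[0]))

try:
    matmul(v, w)  # options C and D: 7x1 times 7x1
except ValueError as err:
    print("v * w fails:", err)
```

Options A and B reproduce the loop result, v * w' has the wrong shape, and v * w cannot be formed at all.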

Linear Regression - Implementing Feature Scaling

I was trying to implement Linear Regression in Octave 5.1.0 on a data set relating the GRE score to the probability of Admission.
The data set is of the sort,
337 0.92
324 0.76
316 0.72
322 0.8
. . .
My main Program.m file looks like,
% read the data
data = load('Admission_Predict.txt');
% initiate variables
x = data(:,1);
y = data(:,2);
m = length(y);
theta = zeros(2,1);
alpha = 0.01;
iters = 1500;
J_hist = zeros(iters,1);
% plot data
plot(x,y,'rx','MarkerSize', 10);
title('training data');
% compute cost function
x = [ones(m,1), (data(:,1) ./ 300)]; % feature scaling
J = computeCost(x,y,theta);
% run gradient descent
[theta, J_hist] = gradientDescent(x,y,theta,alpha,iters);
hold on;
plot((x(:,2) .* 300), (x*theta),'-');
xlabel('GRE score');
hold off;
subplot (1,2,2);
plot(1:iters, J_hist, '-b');
xlabel('no: of iteration');
ylabel('Cost function');
computeCost.m looks like,
function J = computeCost(x,y,theta)
m = length(y);
h = x * theta;
J = (1/(2*m))*sum((h-y) .^ 2);
and gradientDescent.m looks like,
function [theta, J_hist] = gradientDescent(x,y,theta,alpha,iters)
m = length(y);
J_hist = zeros(iters,1);
for i=1:iters
  diff = (x*theta - y);
  theta = theta - (alpha * (1/(m))) * (x' * diff);
  J_hist(i) = computeCost(x,y,theta);
end
The plotted graphs then look like this,
which, as you can see, doesn't look right even though my cost function seems to be minimized.
Can someone please tell me if this is right? If not, what am I doing wrong?
The easiest way to check whether your implementation is correct is to compare with a validated implementation of linear regression. I suggest using an alternative implementation approach like the one suggested here, and then comparing your results. If the fits match, then this is the best linear fit to your data and if they don't match, then there may be something wrong in your implementation.
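One way to run such a comparison, sketched in Python: fit the same data with the closed-form least-squares line and with gradient descent, then check that both give the same slope and intercept. Only the four sample rows quoted in the question are used. Standardizing the feature (subtracting the mean and dividing by the standard deviation) and the values alpha=0.1 and iters=500 are my own choices for this sketch, not the asker's scaling by 300.

```python
def closed_form_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def gradient_descent_fit(xs, ys, alpha=0.1, iters=500):
    """Batch gradient descent on a standardized feature, mapped back to raw units."""
    m = len(xs)
    mu = sum(xs) / m
    sigma = (sum((x - mu) ** 2 for x in xs) / m) ** 0.5
    zs = [(x - mu) / sigma for x in xs]
    t0 = t1 = 0.0
    for _ in range(iters):
        errs = [t0 + t1 * z - y for z, y in zip(zs, ys)]
        g0 = sum(errs) / m
        g1 = sum(e * z for e, z in zip(errs, zs)) / m
        t0 -= alpha * g0
        t1 -= alpha * g1
    slope = t1 / sigma
    return t0 - slope * mu, slope


gre = [337.0, 324.0, 316.0, 322.0]
chance = [0.92, 0.76, 0.72, 0.8]

ref_intercept, ref_slope = closed_form_fit(gre, chance)
gd_intercept, gd_slope = gradient_descent_fit(gre, chance)
print(ref_intercept, ref_slope)
print(gd_intercept, gd_slope)
```

If the two lines agree, gradient descent has converged to the best linear fit; if they differ noticeably, either the implementation has a bug or the descent has not converged yet for the chosen scaling, step size, and iteration count.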

Obtain sigma of gaussian blur between two images

Suppose I have an image A and I applied a Gaussian blur to it with sigma = 3, giving another image B. Is there a way to recover the applied sigma given A and B?
Further clarification:
Image A:
Image B:
I want to write a function that takes A and B and returns sigma:
double get_sigma(cv::Mat const& A,cv::Mat const& B);
Any suggestions?
EDIT1: The suggested approach doesn't work in practice in its original form (i.e. using only 9 equations for a 3 x 3 kernel), and I realized this later. See EDIT1 below for an explanation and EDIT2 for a method that works.
EDIT2: As suggested by Humam, I used the Least Squares Estimate (LSE) to find the coefficients.
I think you can estimate the filter kernel by solving a linear system of equations in this case. A linear filter weighs the pixels in a window by its coefficients, then takes their sum and assigns this value to the center pixel of the window in the result image. So, for a 3 x 3 filter like
h11 h12 h13
h21 h22 h23
h31 h32 h33
the resulting pixel value in the filtered image is
result_pix_value = h11 * a(y, x) + h12 * a(y, x+1) + h13 * a(y, x+2) +
h21 * a(y+1, x) + h22 * a(y+1, x+1) + h23 * a(y+1, x+2) +
h31 * a(y+2, x) + h32 * a(y+2, x+1) + h33 * a(y+2, x+2)
where a's are the pixel values within the window in the original image. Here, for the 3 x 3 filter you have 9 unknowns, so you need 9 equations. You can obtain those 9 equations using 9 pixels in the resulting image. Then you can form an Ax = b system and solve for x to obtain the filter coefficients. With the coefficients available, I think you can find the sigma.
In the following example I'm using non-overlapping windows as shown to obtain the equations.
You don't have to know the size of the filter. If you use a larger size, the coefficients that are not relevant will be close to zero.
Your result image has a different size than the input image, so I didn't use it for the following calculation. I use your input image and apply my own filter.
I tested this in Octave. You can quickly run it if you have Octave/Matlab. For Octave, you need to load the image package.
I'm using the following kernel to blur the image:
h =
0.10963 0.11184 0.10963
0.11184 0.11410 0.11184
0.10963 0.11184 0.10963
When I estimate it using a window size 5, I get the following. As I said, the coefficients that are not relevant are close to zero.
g =
9.5787e-015 -3.1508e-014 1.2974e-015 -3.4897e-015 1.2739e-014
-3.7248e-014 1.0963e-001 1.1184e-001 1.0963e-001 1.8418e-015
4.1825e-014 1.1184e-001 1.1410e-001 1.1184e-001 -7.3554e-014
-2.4861e-014 1.0963e-001 1.1184e-001 1.0963e-001 9.7664e-014
1.3692e-014 4.6182e-016 -2.9215e-014 3.1305e-014 -4.4875e-014
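Once a clean kernel estimate is available, sigma can be read off from the ratio of the center coefficient to its neighbours, since for an isotropic Gaussian g(d)/g(0) = exp(-d^2 / (2 sigma^2)). Here is a minimal Python sketch of that step, using the 3 x 3 kernel h printed above. The function name sigma_from_kernel is hypothetical, and the sketch assumes a noise-free kernel whose entries are all positive and smaller than the center value, so a larger estimate such as g would first have to be cropped to its significant 3 x 3 block.

```python
import math


def sigma_from_kernel(kernel):
    """Estimate sigma of an isotropic Gaussian kernel from coefficient ratios."""
    size = len(kernel)
    c = size // 2
    center = kernel[c][c]
    estimates = []
    for y in range(size):
        for x in range(size):
            d2 = (y - c) ** 2 + (x - c) ** 2
            if d2 == 0:
                continue  # the center itself carries no ratio information
            # g(d) / g(0) = exp(-d^2 / (2 sigma^2))
            # => sigma^2 = d^2 / (2 * ln(g(0) / g(d)))
            estimates.append(d2 / (2.0 * math.log(center / kernel[y][x])))
    # Average the per-coefficient estimates of sigma^2.
    return math.sqrt(sum(estimates) / len(estimates))


h = [
    [0.10963, 0.11184, 0.10963],
    [0.11184, 0.11410, 0.11184],
    [0.10963, 0.11184, 0.10963],
]
sigma = sigma_from_kernel(h)
print(sigma)
```

For this kernel the estimate comes out very close to 5, which is the sigma passed to fspecial in the Octave code below. Because the coefficients of such a wide Gaussian differ only in the third decimal place, the estimate is sensitive to noise in the kernel.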
EDIT1: First of all, my apologies.
This approach doesn't really work in practice. I've used filt = conv2(a, h, 'same'); in the code. The resulting image data type in this case is double, whereas in an actual image the data type is usually uint8, so there's a loss of information, which we can think of as noise. I simulated this with the minor modification filt = floor(conv2(a, h, 'same'));, and then I don't get the expected results.
The sampling approach is not ideal, because it can result in a degenerate system. A better approach is to use random sampling, avoiding the borders and making sure the entries in the b vector are unique. In the ideal case, as in my code, we make sure this way that the system Ax = b has a unique solution.
One approach would be to reformulate this as an Mv = 0 system and try to minimize the squared norm of Mv under the constraint that the squared norm of v equals 1, which we can solve using SVD. I could be wrong here, and I haven't tried this.
Another approach is to use the symmetry of the Gaussian kernel. Then a 3x3 kernel will have only 3 unknowns instead of 9. I think this way we impose additional constraints on the v of the above paragraph.
I'll try these out and post the results, even if I don't get the expected results.
EDIT2: Using the LSE, we can find the filter coefficients as pinv(A'*A)*A'*b. For completeness, I'm adding a simple (and slow) LSE code.
Initial Octave Code:
clear all
im = double(imread('I2vxD.png'));
k = 5;
r = floor(k/2);
a = im(:, :, 1); % take the red channel
h = fspecial('gaussian', [3 3], 5); % filter with a 3x3 gaussian
filt = conv2(a, h, 'same');
% use non-overlapping windows to form the Ax = b system
% NOTE: boundary error checking isn't performed in the code below
s = floor(size(a)/2);
y = s(1);
x = s(2);
w = k*k;
y1 = s(1)-floor(w/2) + r;
y2 = s(1)+floor(w/2);
x1 = s(2)-floor(w/2) + r;
x2 = s(2)+floor(w/2);
b = [];
A = [];
for y = y1:k:y2
  for x = x1:k:x2
    b = [b; filt(y, x)];
    f = a(y-r:y+r, x-r:x+r);
    A = [A; f(:)'];
  end
end
% estimated filter kernel
g = reshape(A\b, k, k)
LSE method:
clear all
im = double(imread('I2vxD.png'));
k = 5;
r = floor(k/2);
a = im(:, :, 1); % take the red channel
h = fspecial('gaussian', [3 3], 5); % filter with a 3x3 gaussian
filt = floor(conv2(a, h, 'same'));
s = size(a);
y1 = r+2; y2 = s(1)-r-2;
x1 = r+2; x2 = s(2)-r-2;
b = [];
A = [];
for y = y1:2:y2
  for x = x1:2:x2
    b = [b; filt(y, x)];
    f = a(y-r:y+r, x-r:x+r);
    f = f(:)';
    A = [A; f];
  end
end
g = reshape(A\b, k, k) % A\b returns the least squares solution
%g = reshape(pinv(A'*A)*A'*b, k, k)
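The least-squares idea can also be tried without any image library. The following Python sketch is a 1-D analogue, not a port of the Octave code: a random synthetic signal (an assumption standing in for an image row) is filtered with a known 3-tap kernel, the result is floored to mimic the uint8 rounding, and a 5-tap kernel is then estimated from many overlapping windows via the normal equations A'A x = A'b. The helper names solve and estimate_kernel are made up for this example.

```python
import math
import random


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_kernel(signal, filtered, k):
    """Least-squares estimate of a k-tap kernel from overlapping windows."""
    r = k // 2
    idx = range(r, len(signal) - r)
    rows = [signal[i - r:i + r + 1] for i in idx]  # one equation per sample
    b = [filtered[i] for i in idx]
    AtA = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    Atb = [sum(row[p] * bi for row, bi in zip(rows, b)) for p in range(k)]
    return solve(AtA, Atb)


random.seed(0)
signal = [float(random.randint(0, 255)) for _ in range(200)]
true_kernel = [0.25, 0.5, 0.25]  # symmetric, so correlation equals convolution

# Filter the interior samples and floor the result to simulate quantization.
filtered = [0.0] * len(signal)
for i in range(1, len(signal) - 1):
    acc = sum(h * signal[i - 1 + j] for j, h in enumerate(true_kernel))
    filtered[i] = float(math.floor(acc))

estimated = estimate_kernel(signal, filtered, 5)
print([round(e, 3) for e in estimated])
```

With far more equations than unknowns, the quantization noise averages out: the three middle coefficients land close to the true kernel and the two outer ones stay close to zero, which mirrors the behaviour described for the 5 x 5 window above.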

Which filter is being used by the get_convolve() function in the CImg library

Which kind of filter is being used by the CImg library's get_convolve() function (written in C++)? Median, Gaussian, bilateral, or some other?
I tried to understand the function so that I can use similar functionality in PIL or OpenCV. In the library's header file CImg.h, it says:
Compute the convolution of the image by a mask.
The result \p res of the convolution of an image \p img by a mask \p mask is defined to be :
res(x,y,z) = sum_{i,j,k} img(x-i,y-j,z-k)*mask(i,j,k)
param mask = the correlation kernel.
param cond = the border condition type (0=zero, 1=dirichlet)
param weighted_convol = enable local normalization.
Declaration is like this:
template<typename t> CImg<typename cimg::superset2<T,t,float>::type>
get_convolve(const CImg<t>& mask, const unsigned int cond=1, const bool weighted_convol=false) const {}
Here is a code snippet:
for (int z = mz1; z<mze; ++z)
for (int y = my1; y<mye; ++y)
for (int x = mx1; x<mxe; ++x) {// For each pixel
Ttfloat val = 0;
for (int zm = -mz1; zm<=mz2; ++zm)
for (int ym = -my1; ym<=my2; ++ym)
for (int xm = -mx1; xm<=mx2; ++xm)
dest(x,y,z,v) = (Ttfloat)val;
if (cond)
for (int x = 0; x<dimx(); (y<my1 || y>=mye || z<mz1 || z>=mze)?++x:((x<mx1-1 || x>=mxe)?++x:(x=mxe))) {
Ttfloat val = 0;
for (int zm = -mz1; zm<=mz2; ++zm) for (int ym = -my1; ym<=my2; ++ym) for (int xm = -mx1; xm<=mx2; ++xm)
dest(x,y,z,v) = (Ttfloat)val;
I am using the mask of 7 x 7 and each of the values inside it is '1'.
What I got from the function was that for each pixel, it takes a 7 by 7 window around it, with the pixel at the center, and then multiplies it with the mask of ones. It feels like some kind of smoothing filter, but which one is it? Which equivalent filter can I use in OpenCV?
I can post the whole function, but it's too long and I don't see the point. I would be really thankful for your help.
So, I found the answer in the thesis of the person who implemented pHash. It said:
During the process of calculating pHash, a mean filter is applied to the image. A kernel with dimension 7x7 is used. To apply this kernel, the get_convolve() function of the CImg library is used.
It is then highlighted as:
For an image I and a mask M it is:
R(x,y,z) = sum_{i,j,k} I(x-i, y-j, z-k) * M(i,j,k)
Then when I looked at the type of filtering functions offered by openCV here, it matched with the box filter function.
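To see why an all-ones mask amounts to a box (mean) filter, here is a minimal pure-Python sketch of the convolution formula quoted above, restricted to 2-D and to interior pixels. The function convolve_interior and the small test image are made up for this example; it is not CImg's implementation and it ignores the border conditions.

```python
def convolve_interior(img, mask):
    """res(x,y) = sum_{i,j} img(x-i, y-j) * mask(i,j), for interior pixels only.

    Mask indices are centered, so i and j run from -r to +r.
    """
    ry, rx = len(mask) // 2, len(mask[0]) // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(ry, h - ry):
        for x in range(rx, w - rx):
            acc = 0.0
            for j in range(-ry, ry + 1):
                for i in range(-rx, rx + 1):
                    acc += img[y - j][x - i] * mask[j + ry][i + rx]
            out[y][x] = acc
    return out


size = 7
ones_mask = [[1.0] * size for _ in range(size)]
img = [[float((3 * x + 5 * y) % 17) for x in range(9)] for y in range(9)]

summed = convolve_interior(img, ones_mask)

# Compare against a direct sum over the 7x7 window around the center pixel.
cy = cx = 4
window = [img[y][x] for y in range(cy - 3, cy + 4) for x in range(cx - 3, cx + 4)]
box_mean = summed[cy][cx] / (size * size)
print(summed[cy][cx], sum(window), box_mean)
```

With every mask entry equal to 1, the convolution is just the sum of the window, and dividing by 49 gives the window mean. In OpenCV terms that corresponds to a box filter, with or without normalization depending on whether the division is applied.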
