FORTRAN counter loop returns multiple iterations of the same value - gfortran

First of all, I am a complete novice to FORTRAN. With that said, I am attempting to "build" a box, then randomly generate x, y, z coordinates for 100 atoms. From there, the goal is to calculate the distance between each pair of atoms, which becomes the value "r" in the Lennard-Jones potential energy equation. I then calculate the LJ potential for each pair, and finally sum the potentials over the entire box. A previous question that I had asked about this project is here. The problem is that I get the same calculated value over and over again. My code is below.
program energytot
  implicit none

  integer, parameter :: n = 100
  integer :: i, j, k, seed(12)
  double precision :: sigma, r, epsilon, lx, ly, lz
  double precision, dimension(n) :: x, y, z, cx, cy, cz
  double precision, dimension(n*(n+1)/2) :: dx, dy, dz, LJx, LJy, LJz

  sigma = 4.1
  epsilon = 1.7

  !Box length with respect to the axis
  lx = 15
  ly = 15
  lz = 15

  do i=1,12
    seed(i)=i+3
  end do

  !generate n random numbers for x, y, z
  call RANDOM_SEED(PUT = seed)
  call random_number(x)
  call random_number(y)
  call random_number(z)

  !convert random numbers into x, y, z coordinates
  cx = ((2*x)-1)*(lx*0.5)
  cy = ((2*y)-1)*(lx*0.5)
  cz = ((2*z)-1)*(lz*0.5)

  do j=1,n-1
    do k=j+1,n
      dx = ABS((cx(j) - cx(k)))
      LJx = 4 * epsilon * ((sigma/dx(j))**12 - (sigma/dx(j))**6)
      dy = ABS((cy(j) - cy(k)))
      LJy = 4 * epsilon * ((sigma/dy(j))**12 - (sigma/dy(j))**6)
      dz = ABS((cz(j) - cz(k)))
      LJz = 4 * epsilon * ((sigma/dz(j))**12 - (sigma/dz(j))**6)
    end do
  end do

  print*, dx
end program energytot

What exactly is your question? What do you want your code to do, and what does it do instead?
If you're having problems with the final print statement print*, dx, try this instead:
print *, 'dx = '
do i = 1, n * (n + 1) / 2
  print *, dx(i)
end do
With n = 100 the array dx has 5050 elements, so printing it in a single statement produces an unreadable wall of numbers; the loop prints one element per line.
Also, it looks like you're repeatedly assigning a single value to the whole array dx (and to the other arrays in the loop). Try this instead:
i = 0
do j=1,n-1
  do k=j+1,n
    i = i + 1
    dx(i) = ABS((cx(j) - cx(k)))
  end do
end do
This way, each value cx(j) - cx(k) gets saved to a different element of dx, instead of overwriting the previously saved values.
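As a quick sanity check on the counter, here is a minimal Python sketch (not part of the Fortran program, purely to count how many times the double loop runs):
# Count how many distinct (j, k) pairs the double loop visits for n = 100.
n = 100
i = 0
for j in range(1, n):              # j = 1 .. n-1
    for k in range(j + 1, n + 1):  # k = j+1 .. n
        i += 1
print(i, n * (n - 1) // 2)  # 4950 4950 -- one array slot per pair, so a size of n*(n+1)/2 = 5050 is more than enough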

My new code goes something like this:
program energytot
  implicit none

  integer, parameter :: n = 6
  integer :: i, j, k, seed(12)
  double precision :: sigma, r, epsilon, lx, ly, lz, etot, pot, rx, ry, rz
  double precision, dimension(n) :: x, y, z, cx, cy, cz

  sigma = 4.1
  epsilon = 1.7
  etot = 0

  !Box length with respect to the axis
  lx = 15
  ly = 15
  lz = 15

  do i=1,12
    seed(i)=i+90
  end do

  !generate n random numbers for x, y, z
  call RANDOM_SEED(PUT = seed)
  call random_number(x)
  call random_number(y)
  call random_number(z)

  !convert random numbers into x, y, z coordinates
  cx = ((2*x)-1)*(lx*0.5)
  cy = ((2*y)-1)*(lx*0.5)
  cz = ((2*z)-1)*(lz*0.5)

  do j=1,n-1
    do k=j+1,n
      rx = (cx(j) - cx(k))
      ry = (cy(j) - cy(k))
      rz = (cz(j) - cz(k))
      !Apply minimum image convention
      rx = rx - lx*anint(rx/lx)
      ry = ry - ly*anint(ry/ly)
      rz = rz - lz*anint(rz/lz)
      r = sqrt(rx**2 + ry**2 + rz**2)
      pot = 4 * epsilon * ((sigma/r)**12 - (sigma/r)**6)
      print*, pot
      etot = etot + pot
    end do
  end do

  print*, etot
end program energytot
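For comparison, here is a minimal NumPy sketch of the same pairwise sum with the minimum image convention. It uses its own random coordinates rather than the Fortran RANDOM_NUMBER stream, so the total it prints will not match the Fortran output; it only illustrates the structure of the calculation.
import numpy as np

# Illustrative cross-check of the pairwise Lennard-Jones sum (same parameters as above,
# but different random coordinates than the Fortran code).
n, box = 6, 15.0
sigma, epsilon = 4.1, 1.7
rng = np.random.default_rng(1)
coords = (2.0 * rng.random((n, 3)) - 1.0) * (box / 2.0)  # points in [-box/2, box/2]^3

etot = 0.0
for j in range(n - 1):
    for k in range(j + 1, n):
        d = coords[j] - coords[k]
        d -= box * np.rint(d / box)          # minimum image convention
        r = np.sqrt((d ** 2).sum())
        etot += 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
print(etot)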

Related

Linear regression with one variable: gradient descent

I want to ask how this equation (the gradient descent update for linear regression),
theta_j := theta_j - alpha * (1/m) * sum_i ( h_theta(x^(i)) - y^(i) ) * x_j^(i),
can be written in Octave this way:
predictions = X * theta;
delta = (1/m) * X' * (predictions - y);
theta = theta - alpha * delta;
I don't understand where the transpose comes from and how the equation gets converted to this form.
The scalar product X.Y is mathematically sum(xi * yi) and can be written as X' * Y in Octave when X and Y are column vectors.
There are other ways to write a scalar product in Octave; see
https://octave.sourceforge.io/octave/function/dot.html
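As a tiny illustration in NumPy (not Octave; the values here are made up), the three expressions below are the same scalar product:
import numpy as np

# The scalar (dot) product sum(x_i * y_i), written three equivalent ways.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print((x * y).sum(), np.dot(x, y), x @ y)  # 32.0 32.0 32.0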
The question seems to be, given an example where:
X = randn(m, k); % m 'input' horizontal-vectors of dimensionality k
y = randn(m, n); % m 'target' horizontal-vectors of dimensionality n
theta = randn(k, n); % a (right) transformation from k to n dimensional
% horizontal-vectors
h = X * theta; % creates m rows of n-dimensional horizontal vectors
how is it that the following code
delta = zeros(k,n)
for j = 1 : k % iterating over all dimensions of the input
for l = 1 : n % iterating over all dimensions of the output
for i = 1 : m % iterating over all observations for that j,l pair
delta(j, l) += (1/m) * (h(i, l) - y(i, l)) * X(i, j);
end
theta(j, l) = theta(j, l) - alpha * delta(j, l);
end
end
can be vectorised as:
h = X * theta ;
delta = (1/ m) * X' * (h - y);
theta = theta - alpha * delta;
To confirm such a vectorised formulation makes sense, it always helps to note (e.g. below each line) the dimensions of the objects involved in the matrix / vectorised operations:
h = X * theta ;
% [m, n] [m, k] [k, n]
delta = (1/ m) * X' * (h - y);
% [k, n] [1, 1] [k, m] [m, n]
theta = theta - alpha * delta;
% [k, n] [k,n] [1, 1] [k, n]
Hopefully now it will become more obvious that they are equivalent.
W.r.t. the X' * D calculation (where D = predictions - y), you can see that:
multiplying the 1st row of X' by the 1st column of D sums over all m observations for j = 1 and l = 1, and places that result at position [1, 1] of the output. Moving along the columns of D while still using the 1st row of X' walks along the n output dimensions, filling in the rest of the first row of the result. Similarly, moving along the rows of X' walks along the k input dimensions, repeating the same process for every column of D, until every row of X' has been combined with every column of D.
If you follow the logic above, you will see that the summations involved are exactly the same as in the for-loop formulation; we simply avoided the explicit loops by using matrix operations instead.
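If it helps, here is a small NumPy sketch (random shapes chosen purely for illustration, not the original data) that checks numerically that the loop and the vectorised delta agree:
import numpy as np

# Shapes follow the answer above: X is (m, k), y is (m, n), theta is (k, n).
m, k, n = 5, 3, 2
rng = np.random.default_rng(0)
X = rng.standard_normal((m, k))
y = rng.standard_normal((m, n))
theta = rng.standard_normal((k, n))

h = X @ theta
delta_loop = np.zeros((k, n))
for j in range(k):
    for l in range(n):
        for i in range(m):
            delta_loop[j, l] += (1.0 / m) * (h[i, l] - y[i, l]) * X[i, j]

delta_vec = (1.0 / m) * X.T @ (h - y)
print(np.allclose(delta_loop, delta_vec))  # True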

F# arrays to 2D histogram

I'm using OxyPlot with F#. I have code to create a single-parameter histogram and plot it. My code for dual-parameter histograms in the form of a contour is too time consuming. I'd like an efficient way to map two vectors or arrays into a 2D histogram. I'm including my code for the regular histogram.
let myHistogram c =
    flatten dataArray.[c..c,*]
    |> Seq.toArray
    |> Array.map (fun x -> round(float(x)/16.0))
    |> Seq.countBy (fun x -> x)
    |> Seq.sort
    |> Seq.map snd
So, I'm looking to take dataArray.[a..a,*] and dataArray.[b..b,*] and place them into bins of a specific resolution to create histogram[x,y]. OxyPlot needs the histogram in order to create a contour.
Imagine two arrays of data, one called Alexa647-H and the other BV786-H. Each array contains 100,000 integers ranging between 0 and 10,000. You could plot these arrays as a dot plot in OxyPlot. That is straightforward: simply plot one array on the X-axis and one array on the Y-axis. I've included a plot below.
My question involves creating a contour plot out of the same data. For that, I need to first determine a resolution, say for convenience 100x100. Therefore I want to end up with a 2D array called hist2(100,100). The array is basically 10,000 bins, each covering 100x100 in data units. Each bin contains the count of elements which fall into a particular range -- a 2D histogram.
[Figure: Dot and Contour]
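For illustration only, here is roughly what that binning looks like in NumPy (synthetic data standing in for the two parameters; an F# translation would follow the same shape):
import numpy as np

# Synthetic stand-ins for the two parameters (e.g. Alexa647-H and BV786-H):
# 100,000 integers between 0 and 10,000, binned onto a 100 x 100 grid.
rng = np.random.default_rng(0)
a = rng.integers(0, 10_001, size=100_000)
b = rng.integers(0, 10_001, size=100_000)

hist2, xedges, yedges = np.histogram2d(a, b, bins=100, range=[[0, 10_000], [0, 10_000]])
# hist2[i, j] holds the count of (a, b) pairs falling in bin (i, j) -- the kind of
# 2D array a ContourSeries expects as its Data.
print(hist2.shape)  # (100, 100)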
The coding example in OxyPlot generates a peak array mathematically. I want to generate that contour input peak array as outlined above, instead.
var model = new PlotModel { Title = "ContourSeries" };
double x0 = -3.1;
double x1 = 3.1;
double y0 = -3;
double y1 = 3;
//generate values
Func<double, double, double> peaks = (x, y) => 3 * (1 - x) * (1 - x) * Math.Exp(-(x * x) - (y + 1) * (y + 1)) - 10 * (x / 5 - x * x * x - y * y * y * y * y) * Math.Exp(-x * x - y * y) - 1.0 / 3 * Math.Exp(-(x + 1) * (x + 1) - y * y);
var xx = ArrayBuilder.CreateVector(x0, x1, 100);
var yy = ArrayBuilder.CreateVector(y0, y1, 100);
var peaksData = ArrayBuilder.Evaluate(peaks, xx, yy);
var cs = new ContourSeries
{
Color = OxyColors.Black,
LabelBackground = OxyColors.White,
ColumnCoordinates = yy,
RowCoordinates = xx,
Data = peaksData
};
model.Series.Add(cs);
[Figure: Plot generated by the OxyPlot code]
I hope this clears things up.

Correlation (offset detection) issues - Signal power concentrated at edge of domain

I'm in a bit of a bind - I am in too deep to quickly apply another technique, so here goes nothing...
I'm doing line tracking by correlating each row of a matrix with the row below it and taking the max of the correlation to compute the offset. It works extremely well EXCEPT when the signals are up against the edge of the domain. It simply gives 0. I suspect this is because it is advantageous to simply add in place rather than shift in 0's at the edge. Here are some example signals that cause the issue. These signals aren't zero-mean, but they are when I correlate (I subtract the mean). I get the correct offset for the third image, but not for the first two.
Here is my correlation code
from numpy import mean
from scipy import signal

x0 -= mean(x0)
x1 -= mean(x1)
x0 /= max(x0)
x1 /= max(x1)
c = signal.correlate(x1, x0, mode='full')
m = interp_peak_offset(c)  # user-defined sub-sample peak interpolation
foffset = (m - len(x0) + 1) * (f[2] - f[1])
I have tried clipping one of the signals by 20 samples on each side, correlating the gradient of the signal, and some other wonky methods with no success...
Any help is greatly appreciated! Thanks so much!
Instead of looking for the maximum amplitude, you should look for the phase difference.
This can be achieved using the PHAT (PHAse Transform) method:
import numpy as np
from numpy.fft import ifft
from scipy.signal import csd

def PHAT(x, y, fs, nperseg=50):
    f, pxy = csd(x, y, fs=1.0, nperseg=nperseg, return_onesided=False)
    pxy_phase = np.divide(pxy, np.abs(pxy))  # keep only the phase of the cross-spectrum
    gcc_fun = np.real(ifft(pxy_phase))       # generalized cross correlation
    TDOA = np.argmax(gcc_fun) / float(fs)
    return TDOA
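A hypothetical usage sketch (synthetic signals, not from the original answer), assuming the PHAT function above is in scope:
import numpy as np

# y is x delayed by 5 samples at fs = 100 Hz, plus a little noise.
rng = np.random.default_rng(0)
fs = 100.0
x = rng.standard_normal(1000)
y = np.roll(x, 5) + 0.1 * rng.standard_normal(1000)
print(PHAT(x, y, fs))  # delay estimate in seconds; 5 samples corresponds to 0.05 s,
                       # up to the sign convention and wrap-around of the argmax index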
I ended up minimizing the average absolute difference between the two vectors. For each time shift, I computed the absolute difference/number of points of overlap. Here is my function that does so
from numpy import zeros, arange, pad, roll, logical_and

def offset_using_diff(x0, x1, f):
    # Finds the offset of x0 from x1 such that x0(f) ~ x1(f - foffset). Does so by
    # minimizing the average absolute difference between the two signals, with one signal
    # shifted.
    # In other words, we minimize |x0 - x1|/N where N is the number of points overlapping
    # between x1 and the shifted version of x0
    # Args:
    #   x0, x1 (vector): data
    #   f (vector): frequency vector
    # Returns:
    #   foffset (float): frequency offset
    OMAX = min(len(x0) // 2, 100)  # max offset in samples
    dvec = zeros((2 * OMAX,))
    offsetvec = arange(-OMAX + 1, OMAX + 1)
    y0 = x0.copy()
    y1 = x1.copy()
    y0 -= min(y0)
    y1 -= min(y1)
    y0 = pad(y0, (100, 100), 'constant', constant_values=(0, 0))
    y1 = pad(y1, (100, 100), 'constant', constant_values=(0, 0))
    for i, offset in enumerate(offsetvec):
        d0 = roll(y0, offset)
        d1 = y1
        iinds1 = d0 != 0
        iinds2 = d1 != 0
        iinds = logical_and(iinds1, iinds2)
        d0 = d0[iinds]
        d1 = d1[iinds]
        diff = d0 - d1
        dvec[i] = sum(abs(diff)) / len(d0)
    m = interp_peak_offset(-1 * dvec)  # user-defined sub-sample peak finder
    foffset = (m - OMAX + 1) * (f[2] - f[1])
    return foffset

In case of logistic regression, how should I interpret this learning curve between cost and number of examples?

I have obtained the following learning curve when plotting the error cost against the number of training examples (in hundreds on the graph) for the training and cross-validation sets. Can someone please tell me if this learning curve is ever possible? I am under the impression that the cross-validation error should decrease as the number of training examples increases.
[Figure: learning curve. The x axis denotes the number of training examples in 100s.]
EDIT :
This is the code which I use to calculate the 9 values for plotting the learning curves.
X is the 2D matrix of the training set examples. It is of dimensions m x (n+1). y is of dimensions m x 1, and each element has value 1 or 0.
for j=1:9
disp(j)
[theta,J] = trainClassifier(X(1:(j*100),:),y(1:(j*100)),lambda);
[error_train(j), grad] = costprediciton_train(theta , X(1:(j*100),:), y(1:(j*100)));
[error_cv(j), grad] = costfunction_test2(theta , Xcv(1:(j*100),:),ycv(1:(j*100)));
end
The code I use for finding the optimal value of Theta from the training set.
% Train the classifer. Return theta
function [optTheta, J] = trainClassifier(X,y,lambda)
[m,n]=size(X);
initialTheta = zeros(n, 1);
options=optimset('GradObj','on','MaxIter',100);
[optTheta, J, Exit_flag] = fminunc(@(t)(regularizedCostFunction(t, X, y, lambda)), initialTheta, options);
end
%regularized cost
function [J, grad] = regularizedCostFunction(theta, X, y,lambda)
[m,n]=size(X);
h=sigmoid( X * theta);
temp1 = -1 * (y .* log(h));
temp2 = (1 - y) .* log(1 - h);
thetaT = theta;
thetaT(1) = 0;
correction = sum(thetaT .^ 2) * (lambda / (2 * m));
J = sum(temp1 - temp2) / m + correction;
grad = (X' * (h - y)) * (1/m) + thetaT * (lambda / m);
end
The code I use for calculating the error cost of the predictions on the training set (the code for the CV-set error cost is similar):
Theta is of dimensions (n+1) x 1 and consists of the coefficients of the features in the hypothesis function.
function [J,grad] = costprediciton_train(theta , X, y)
[m,n]=size(X);
h=sigmoid(X * theta);
temp1 = y .* log(h);
temp2 = (1-y) .* log(1- h);
J = -sum (temp1 + temp2)/m;
t=h-y;
grad=(X'*t)*(1/m);
end
function [J,grad] = costfunction_test2(theta , X, y)
m= length(y);
h=sigmoid(X*theta);
temp1 = y .* log(h);
temp2 = (1-y) .* log(1- h);
J = -sum (temp1 + temp2)/m ;
grad = (X' * (h - y)) * (1/m) ;
end
The Sigmoid function:
function g = sigmoid(z)
g= zeros(size(z));
den=1 + exp(-1*z);
g = 1 ./ den;
end
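For reference, the regularized cost implemented above is J = -(1/m) * sum( y.*log(h) + (1-y).*log(1-h) ) + (lambda/(2*m)) * sum(theta(2:end).^2). Below is a minimal NumPy sketch of the same formula, with a tiny made-up data set rather than the asker's matrices:
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, X, y, lam):
    # Mirrors regularizedCostFunction above: cross-entropy cost plus an L2 penalty
    # that skips the intercept term theta[0].
    m = y.size
    h = sigmoid(X @ theta)
    theta_t = theta.copy()
    theta_t[0] = 0.0
    J = (-(y * np.log(h)) - (1 - y) * np.log(1 - h)).sum() / m \
        + (lam / (2 * m)) * (theta_t ** 2).sum()
    grad = X.T @ (h - y) / m + (lam / m) * theta_t
    return J, grad

# Tiny made-up example: at theta = 0 the cost is ln(2) ~ 0.6931 regardless of the data.
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.3]])
y = np.array([1.0, 0.0, 1.0])
print(regularized_cost(np.zeros(2), X, y, lam=1.0)[0])  # ~0.6931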

Cost Function, Linear Regression, trying to avoid hard coding theta. Octave.

I'm in the second week of Professor Andrew Ng's Machine Learning course through Coursera. We're working on linear regression and right now I'm dealing with coding the cost function.
The code I've written solves the problem correctly but does not pass the submission process and fails the unit test because I have hard coded the values of theta and not allowed for more than two values for theta.
Here's the code I've got so far
function J = computeCost(X, y, theta)
m = length(y);
J = 0;
for i = 1:m,
h = theta(1) + theta(2) * X(i)
a = h - y(i);
b = a^2;
J = J + b;
end;
J = J * (1 / (2 * m));
end
the unit test is
computeCost( [1 2 3; 1 3 4; 1 4 5; 1 5 6], [7;6;5;4], [0.1;0.2;0.3])
and should produce ans = 7.0175
So I need to add another for loop to iterate over theta, therefore allowing for any number of values for theta, but I'll be damned if I can wrap my head around how/where.
Can anyone suggest a way I can allow for any number of values for theta within this function?
If you need more information to understand what I'm trying to ask, I will try my best to provide it.
You can use vectorized operations in Octave/Matlab.
Iterating over an entire vector element by element is a really bad idea if your programming language lets you vectorize operations.
R, Octave, Matlab, and Python (numpy) all allow this.
For example, you can get the scalar product of theta = (t0, t1, t2, t3) and X = (x0, x1, x2, x3) in the following way:
theta * X' = (t0, t1, t2, t3) * (x0, x1, x2, x3)' = t0*x0 + t1*x1 + t2*x2 + t3*x3
The result will be a scalar.
For example, you can vectorize h in your code as follows:
H = (theta'*X')';
S = sum((H - y) .^ 2);
J = S / (2*m);
The above answer is perfect, but you can also do
H = (X*theta);
S = sum((H - y) .^ 2);
J = S / (2*m);
Rather than computing
(theta' * X')'
i.e. theta' * X' followed by a transpose, you can directly calculate
(X * theta)
It works perfectly.
The line below returns the required 32.07 cost value when we run computeCost once with θ initialized to zeros:
J = (1/(2*m)) * (sum(((X * theta) - y).^2));
and is equivalent to the original formula, J(theta) = (1/(2*m)) * sum_i (h_theta(x^(i)) - y^(i))^2.
It can also be done in one line (m = number of training examples):
J=(1/(2*m)) * ((((X * theta) - y).^2)'* ones(m,1));
J = sum(((X*theta)-y).^2)/(2*m);
ans = 32.073
The above answer is perfect. I thought about the problem deeply for a day and am still unfamiliar with Octave, so let's study together!
If you want to use only matrix operations:
temp = (X * theta - y); % h(x) - y
J = ((temp')*temp)/(2 * m);
clear temp;
This would work just fine for you -
J = sum((X*theta - y).^2)*(1/(2*m))
This directly follows from the Cost Function Equation
Python code for the same:
import numpy as np

def computeCost(X, y, theta):
    m = y.size                # number of training examples
    H = X.dot(theta)          # hypothesis h(x) for every example
    S = np.sum((H - y) ** 2)  # sum of squared errors
    J = S / (2 * m)
    return J
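As a quick sanity check, the Python version above reproduces the unit test from the question (assuming computeCost as just defined is in scope):
import numpy as np

X = np.array([[1, 2, 3], [1, 3, 4], [1, 4, 5], [1, 5, 6]], dtype=float)
y = np.array([7.0, 6.0, 5.0, 4.0])
theta = np.array([0.1, 0.2, 0.3])
print(computeCost(X, y, theta))  # 7.0175, matching the expected ans from the question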
function J = computeCost(X, y, theta)
m = length(y);
J = 0;
% Hypothesis h(x)
h = X * theta;
% Error function (h(x) - y) ^ 2
squaredError = (h-y).^2;
% Cost function
J = sum(squaredError)/(2*m);
end
I think we need to use iteration for a more general solution of the cost, rather than a single computation; also, the 32.07 result shown in the PDF may not be the answer the grader is looking for, since it is only one case out of many possible training data sets.
I think it should loop through like this
for i = 1:iterations
  theta = theta - alpha * (1/m) * X' * (X*theta - y);
  J = (1/(2*m)) * sum((X*theta - y).^2);
end
