F# arrays to 2D histogram


I’m using OxyPlot with F#. I have code to create a single parameter histogram and plot it. My code for dual parameter histograms in the form of a contour is too time consuming. I’d like an efficient way to map two vectors or arrays into a 2D histogram. I’m including my code for regular histogram.
let myHistogram c =
    flatten dataArray.[c..c,*]
    |> Seq.toArray
    |> Array.map (fun x -> round(float(x)/16.0))
    |> Seq.countBy (fun x -> x)
    |> Seq.sort
    |> Seq.map snd
So, I’m looking to take dataArray.[a..a,*] and dataArray.[b..b,*] and place them into bins of a specific resolution to create histogram[x,y]. OxyPlot needs the histogram in order to create a contour.
Imagine two arrays of data, one called Alexa647-H and the other BV786-H. Each array contains 100,000 integers ranging between 0 and 10,000. You could plot these arrays as a dot plot in OxyPlot. That is straightforward: simply plot one array on the X-axis and the other on the Y-axis. I've included a plot below.
My question involves creating a contour plot out of the same data. For that, I first need to choose a resolution, say 100x100 for convenience. I therefore want to end up with a 2D array called hist2(100,100). That array is 10,000 bins, each covering a 100x100 range of values. Each bin contains the count of elements that fall into its range: a 2D histogram.
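The binning described above can be sketched in a single pass over the data. This is an illustration in Python rather than F#, and it is not taken from the thread: the function name `hist2d`, its parameters, and the assumption that all values are non-negative and at most `max_value` are hypothetical.

```python
def hist2d(xs, ys, bins=100, max_value=10_000):
    """Count (x, y) pairs into a bins-by-bins grid in a single pass.

    Assumes every value lies in [0, max_value] (an assumption for this sketch).
    """
    width = max_value / bins          # data units covered by one bin
    hist = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        # Clamp so that a value equal to max_value lands in the last bin.
        i = min(int(x / width), bins - 1)
        j = min(int(y / width), bins - 1)
        hist[i][j] += 1
    return hist


# Tiny usage example with made-up data.
xs = [0, 50, 150, 9_999, 10_000]
ys = [0, 99, 250, 9_999, 10_000]
h = hist2d(xs, ys)
```

Each point is visited exactly once, so the cost grows linearly with the number of points; the same index arithmetic should carry over to a mutable `Array2D` in F#.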
Dot and Contour
The coding example in OxyPlot generates a peak array mathematically. I want to generate that contour input array from my data as outlined above, instead.
var model = new PlotModel { Title = "ContourSeries" };
double x0 = -3.1;
double x1 = 3.1;
double y0 = -3;
double y1 = 3;
//generate values
Func<double, double, double> peaks = (x, y) => 3 * (1 - x) * (1 - x) * Math.Exp(-(x * x) - (y + 1) * (y + 1)) - 10 * (x / 5 - x * x * x - y * y * y * y * y) * Math.Exp(-x * x - y * y) - 1.0 / 3 * Math.Exp(-(x + 1) * (x + 1) - y * y);
var xx = ArrayBuilder.CreateVector(x0, x1, 100);
var yy = ArrayBuilder.CreateVector(y0, y1, 100);
var peaksData = ArrayBuilder.Evaluate(peaks, xx, yy);
var cs = new ContourSeries
{
Color = OxyColors.Black,
LabelBackground = OxyColors.White,
ColumnCoordinates = yy,
RowCoordinates = xx,
Data = peaksData
};
model.Series.Add(cs);
Plot generated by OxyPlot code
I hope this clears things up.
Don

Related

linear regression with one variable Gradient descent

I want to ask how this equation
can be written in Octave in the following way:
predictions = X * theta;
delta = (1/m) * X' * (predictions - y);
theta = theta - alpha * delta;
I don't understand where the transpose comes from, or how the equation is converted to vectorized form in this way.

The scalar product X.Y is mathematically sum (xi * yi) and can be written as X' * Y in octave when X and Y are vectors.
There are other ways to write a scalar product in octave, cf
https://octave.sourceforge.io/octave/function/dot.html

The question seems to be, given an example where:
X = randn(m, k); % m 'input' horizontal-vectors of dimensionality k
y = randn(m, n); % m 'target' horizontal-vectors of dimensionality n
theta = randn(k, n); % a (right) transformation from k to n dimensional
% horizontal-vectors
h = X * theta; % creates m rows of n-dimensional horizontal vectors
how is it that the following code
delta = zeros(k, n);
for j = 1 : k % iterating over all dimensions of the input
  for l = 1 : n % iterating over all dimensions of the output
    for i = 1 : m % iterating over all observations for that j,l pair
      delta(j, l) += (1/m) * (h(i, l) - y(i, l)) * X(i, j);
    end
    theta(j, l) = theta(j, l) - alpha * delta(j, l);
  end
end
can be vectorised as:
h = X * theta ;
delta = (1/ m) * X' * (h - y);
theta = theta - alpha * delta;
To confirm such a vectorised formulation makes sense, it always helps to note (e.g. below each line) the dimensions of the objects involved in the matrix / vectorised operations:
h = X * theta ;
% [m, n] [m, k] [k, n]
delta = (1/ m) * X' * (h - y);
% [k, n] [1, 1] [k, m] [m, n]
theta = theta - alpha * delta;
% [k, n] [k,n] [1, 1] [k, n]
Hopefully now it will become more obvious that they are equivalent.
W.r.t the X' * D calculation (where D = predictions - y) you can see that:
Performing matrix multiplication with the 1st row of X' and the 1st column of D is equal to summing over all m observations for k=1 and n=1, and placing that result at position [k=1, n=1] in the output matrix. Moving along the columns of D while still multiplying by the 1st row of X', we simply move along the n dimensions of D and place each result accordingly in the output. Similarly, moving along the rows of X' moves along the k dimensions of X', repeating the same process for all n columns of D, until the multiplications over all rows of X' and all columns of D are finished.
If you follow the logic above, you will see that the summations involved are exactly the same as in the for loop formulation, but we managed to avoid using a for loop and use matrix operations instead.
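The equivalence can also be checked numerically. The following is a small sketch in plain Python (not Octave) with made-up numbers; the helpers `matmul` and `transpose` are hypothetical stand-ins for Octave's `*` and `'` operators.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Swap rows and columns of a list of rows."""
    return [list(row) for row in zip(*a)]


m, k, n = 4, 3, 2
X = [[1.0, 2.0, 0.5], [0.0, 1.0, 3.0], [2.0, -1.0, 1.0], [1.5, 0.5, -2.0]]  # [m, k]
y = [[1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [1.0, 3.0]]                        # [m, n]
theta = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]                               # [k, n]

h = matmul(X, theta)                                            # [m, n]
D = [[h[i][l] - y[i][l] for l in range(n)] for i in range(m)]   # [m, n]

# Loop form: one sum over the m observations per (j, l) pair.
delta_loop = [[sum(D[i][l] * X[i][j] for i in range(m)) / m
               for l in range(n)] for j in range(k)]

# Matrix form: (1/m) * X' * D, with shape [k, m] * [m, n] = [k, n].
delta_vec = [[v / m for v in row] for row in matmul(transpose(X), D)]
```

Both `delta_loop` and `delta_vec` have shape [k, n] and agree entry by entry, because each entry of X' * D is exactly the sum over observations that the innermost loop computes.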

How to calculate FFT of a time series in 3D space (X, Y, T)

A time series (x, y, t) in 3D space (X, Y, T) satisfies:
x(t) = f1(t), y(t) = f2(t),
where t = 1, 2, 3,....
In other words, coordinates (x, y) vary with timestamp t. It is easy to compute the FFT of x(t) or y(t), but how do you calculate the FFT of (x, y)? I assume it should NOT be computed as a 2D-FFT, because that is for an image, whereas (x, y) is just a series. Any suggestion? Thank you.

Use fftn. For example, Y = fftn(X) returns the multidimensional Fourier transform of an N-D array using a fast Fourier transform algorithm. The N-D transform is equivalent to computing the 1-D transform along each dimension of X. The output Y is the same size as X.
For a 3-D transform:
Create a 3-D signal X. The size of X is 20-by-20-by-20
x = (1:20)';
y = 1:20;
z = reshape(1:20,[1 1 20]);
X = cos(2*pi*0.01*x) + sin(2*pi*0.02*y) + cos(2*pi*0.03*z);
Compute the 3-D Fourier transform of the signal, which is also a 20-by-20-by-20 array.
Y = fftn(X)
Pad X with zeros to compute a 32-by-32-by-32 transform.
m = nextpow2(20);
Y = fftn(X,[2^m 2^m 2^m]);
size(Y)
You can also use the following code.
First, you might use single instead of double precision:
psi = single(psi);
fftpsi = fft(psi,[],3);
Next, you might work slice by slice:
psi=rand(10,10,10);
% costly way
fftpsi=fftn(psi);
% This might save you some RAM, to be tested
[m,n,p] = size(psi);
for k=1:p
psi(:,:,k) = fftn(psi(:,:,k));
end
psi = reshape(psi,[m*n p]);
for i=1:m*n % you might work on bigger row-block to increase speed
psi(i,:) = fft(psi(i,:));
end
psi = reshape(psi,[m n p]);
% Check
norm(psi(:)-fftpsi(:))
I hope it will be useful for you
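The statement above that the N-D transform equals 1-D transforms applied along each dimension can be illustrated on a tiny 2-D array. This is a sketch in plain Python with a naive O(N^2) DFT rather than a real FFT; the function names `dft`, `dft2_separable` and `dft2_direct` are hypothetical.

```python
import cmath


def dft(seq):
    """Naive O(N^2) 1-D discrete Fourier transform."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def dft2_separable(a):
    """2-D DFT built from 1-D DFTs: first along rows, then along columns."""
    rows = [dft(r) for r in a]
    cols = [dft(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def dft2_direct(a):
    """2-D DFT straight from the double-sum definition."""
    m, n = len(a), len(a[0])
    return [[sum(a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)] for u in range(m)]


a = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0]]
sep = dft2_separable(a)
direct = dft2_direct(a)
```

The two results match up to rounding error, which is the same property the slice-by-slice loop above relies on when it transforms two dimensions first and the third afterwards.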

Vectorization issue

Say you have two column vectors v and w, each with 7 elements (i.e., they have dimensions 7x1). Consider the following code:
z = 0;
for i = 1:7
z = z + v(i) * w(i)
end
A) z = sum (v .* w);
B) z = w' * v;
C) z = v * w;
D) z = w * v;
According to the solutions, answers (A) AND (B) are the right answers, can someone please help me understand why?
Why is z = v * w' false, when it is similar to answer (B) with only the order of the operands changed? Since we want a scalar, wouldn't we need a product of this shape: 1x7 * 7x1 = 1x1? And why is z = v' * w false? It gives the same dimensions as answer (B).

z = v'*w is correct and is equal to w'*v.
Both produce a 1x1 matrix, which Octave treats as a scalar.
See this:
octave:5> v = rand(7, 1);
octave:6> w = rand(7, 1);
octave:7> v'*w
ans = 1.3110
octave:8> w'*v
ans = 1.3110
octave:9> sum(v.*w)
ans = 1.3110

Answers A and B both perform a dot product of the two vectors, which yields the same result as the code provided. Answer A first performs the element-wise product (.*) of the two column vectors, then sums those intermediate values. Answer B performs the same mathematical operation but does so via a dot product (i.e., matrix multiplication).
Answer C is incorrect because it would be performing a matrix multiplication on misaligned matrices (7x1 and 7x1). The same is true for D.
z = v * w', which was not one of the options, is incorrect because it would yield a 7x7 matrix (instead of the 1x1 scalar value desired). The point is that order matters when performing matrix multiplication. (1xN)X(Nx1) -> (1x1), whereas (Nx1)X(1xN) -> (NxN).
z = v' * w is actually a correct solution but was simply not provided as one of the options.
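The difference between the two orderings can be made concrete with a short sketch in plain Python (lists instead of Octave vectors, with made-up values):

```python
v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
w = [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]

# Loop from the question.
z_loop = 0.0
for i in range(7):
    z_loop += v[i] * w[i]

# (A) sum(v .* w): element-wise product, then sum.
# This is also what w' * v and v' * w compute: (1x7)(7x1) -> 1x1.
z_a = sum(vi * wi for vi, wi in zip(v, w))

# v * w': (7x1)(1x7) -> a 7x7 outer product, not a scalar.
outer = [[vi * wj for wj in w] for vi in v]
```

Here `z_loop` and `z_a` are the same single number, while `outer` holds 49 entries, which is why v * w' cannot replace the loop.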

FORTRAN counter loop returns multiple iterations of the same value

First of all I am a complete novice to FORTRAN. With that said I am attempting to "build" a box, then randomly generate x, y, z coordinates for 100 atoms. From there, the goal is to calculate the distance between each atom, which becomes the value "r" of the Lennard-Jones potential energy equation. Then calculate the LJ potential, and finally sum the potential of the entire box. A previous question that I had asked about this project is here. The problem is that I get the same calculated value over and over and over again. My code is below.
program energytot
implicit none
integer, parameter :: n = 100
integer :: i, j, k, seed(12)
double precision :: sigma, r, epsilon, lx, ly, lz
double precision, dimension(n) :: x, y, z, cx, cy, cz
double precision, dimension(n*(n+1)/2) :: dx, dy, dz, LJx, LJy, LJz
sigma = 4.1
epsilon = 1.7
!Box length with respect to the axis
lx = 15
ly = 15
lz = 15
do i=1,12
seed(i)=i+3
end do
!generate n random numbers for x, y, z
call RANDOM_SEED(PUT = seed)
call random_number(x)
call random_number(y)
call random_number(z)
!convert random numbers into x, y, z coordinates
cx = ((2*x)-1)*(lx*0.5)
cy = ((2*y)-1)*(ly*0.5)
cz = ((2*z)-1)*(lz*0.5)
do j=1,n-1
do k=j+1,n
dx = ABS((cx(j) - cx(k)))
LJx = 4 * epsilon * ((sigma/dx(j))**12 - (sigma/dx(j))**6)
dy = ABS((cy(j) - cy(k)))
LJy = 4 * epsilon * ((sigma/dy(j))**12 - (sigma/dy(j))**6)
dz = ABS((cz(j) - cz(k)))
LJz = 4 * epsilon * ((sigma/dz(j))**12 - (sigma/dz(j))**6)
end do
end do
print*, dx
end program energytot

What exactly is your question? What do you want your code to do, and what does it do instead?
If you're having problems with the final print statement print*, dx, try this instead:
print *, 'dx = '
do i = 1, n * (n + 1) / 2
print *, dx(i)
end do
It seems that dx is too big to be printed without a loop.
Also, it looks like each pass through the loop assigns a single scalar to every element of the array dx (and likewise for the other arrays in the loop), overwriting the previous values. Try this instead:
i = 0
do j=1,n-1
do k=j+1,n
i = i + 1
dx(i) = ABS((cx(j) - cx(k)))
end do
end do
This way, each value cx(j) - cx(k) gets saved to a different element of dx, instead of overwriting previously saved values.
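The counter pattern can be sketched in Python as follows (0-based indices instead of Fortran's 1-based ones; the function name `pair_differences` is hypothetical):

```python
def pair_differences(c):
    """Return |c[j] - c[k]| for every unordered pair j < k, in loop order."""
    n = len(c)
    out = [0.0] * (n * (n - 1) // 2)   # exactly one slot per pair
    i = 0
    for j in range(n - 1):
        for k in range(j + 1, n):
            out[i] = abs(c[j] - c[k])
            i += 1
    return out


dx = pair_differences([0.0, 1.0, 3.0, 7.0])
```

The double loop visits n(n-1)/2 pairs, so an array declared with n(n+1)/2 elements, as in the question, is left with n unused slots at the end.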

My new code goes something like this:
program energytot
implicit none
integer, parameter :: n = 6
integer :: i, j, k, seed(12)
double precision :: sigma, r, epsilon, lx, ly, lz, etot, pot, rx, ry, rz
double precision, dimension(n) :: x, y, z, cx, cy, cz
sigma = 4.1
epsilon = 1.7
etot=0
!Box length with respect to the axis
lx = 15
ly = 15
lz = 15
do i=1,12
seed(i)=i+90
end do
!generate n random numbers for x, y, z
call RANDOM_SEED(PUT = seed)
call random_number(x)
call random_number(y)
call random_number(z)
!convert random numbers into x, y, z coordinates
cx = ((2*x)-1)*(lx*0.5)
cy = ((2*y)-1)*(ly*0.5)
cz = ((2*z)-1)*(lz*0.5)
do j=1,n-1
do k=j+1,n
rx = (cx(j) - cx(k))
ry = (cy(j) - cy(k))
rz = (cz(j) - cz(k))
!Apply minimum image convention
rx=rx-lx*anint(rx/lx)
ry=ry-ly*anint(ry/ly)
rz=rz-lz*anint(rz/lz)
r=sqrt(rx**2+ry**2+rz**2)
pot=4 * epsilon * ((sigma/r)**12 - (sigma/r)**6)
print*,pot
etot=etot+pot
end do
end do
print*, etot
end program energytot
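For comparison, the same pair loop with the minimum image convention can be sketched in Python. This is an illustrative port, not the author's code: the names `nearest_int` and `lj_total` are hypothetical, and `nearest_int` is written to mimic Fortran's `anint` (halves rounded away from zero), because Python's built-in `round` rounds halves to even.

```python
import math


def nearest_int(v):
    """Round to the nearest integer, halves away from zero (like anint)."""
    return math.floor(abs(v) + 0.5) * (1 if v >= 0 else -1)


def lj_total(coords, box, sigma, epsilon):
    """Sum the Lennard-Jones potential over all unordered pairs of atoms."""
    total = 0.0
    for j in range(len(coords) - 1):
        for k in range(j + 1, len(coords)):
            # Minimum image convention, applied per axis.
            d = [a - b for a, b in zip(coords[j], coords[k])]
            d = [di - length * nearest_int(di / length)
                 for di, length in zip(d, box)]
            r = math.sqrt(sum(di * di for di in d))
            total += 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return total


sigma, epsilon = 4.1, 1.7
box = (15.0, 15.0, 15.0)
r_min = 2 ** (1 / 6) * sigma   # separation at the potential minimum
e_pair = lj_total([(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)], box, sigma, epsilon)
```

A convenient sanity check is a single pair placed at the separation 2^(1/6) * sigma, where the potential reaches its minimum value of -epsilon.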

OpenCV 2d line intersection helper function

I was looking for a helper function to calculate the intersection of two lines in OpenCV. I have searched the API Documentation, but couldn't find a useful resource.
Are there basic geometric helper functions for intersection/distance calculations on lines/line segments in OpenCV?

There is no function in the OpenCV API to calculate line intersections, but distance can be computed with cv::norm:
cv::Point2f start, end;
double length = cv::norm(end - start);
If you need a piece of code to calculate line intersections then here it is:
// Finds the intersection of two lines, or returns false.
// The lines are defined by (o1, p1) and (o2, p2).
bool intersection(Point2f o1, Point2f p1, Point2f o2, Point2f p2,
Point2f &r)
{
Point2f x = o2 - o1;
Point2f d1 = p1 - o1;
Point2f d2 = p2 - o2;
float cross = d1.x*d2.y - d1.y*d2.x;
if (std::abs(cross) < /*EPS*/1e-8)
return false;
double t1 = (x.x * d2.y - x.y * d2.x)/cross;
r = o1 + d1 * t1;
return true;
}
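The same parametric approach can be sketched in Python with plain tuples in place of `Point2f`. This is an illustrative port; returning `None` instead of `false` and the `eps` default are choices made for this sketch.

```python
def intersection(o1, p1, o2, p2, eps=1e-8):
    """Intersect the infinite lines through (o1, p1) and (o2, p2).

    Returns the (x, y) point, or None when the lines are (nearly) parallel.
    """
    x = (o2[0] - o1[0], o2[1] - o1[1])
    d1 = (p1[0] - o1[0], p1[1] - o1[1])
    d2 = (p2[0] - o2[0], p2[1] - o2[1])
    # 2D cross product of the two direction vectors; zero means parallel.
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None
    # Parameter along the first line at which it meets the second.
    t1 = (x[0] * d2[1] - x[1] * d2[0]) / cross
    return (o1[0] + d1[0] * t1, o1[1] + d1[1] * t1)


crossing = intersection((0, 0), (2, 2), (0, 2), (2, 0))
parallel = intersection((0, 0), (1, 1), (0, 1), (1, 2))
vertical = intersection((1, 0), (1, 5), (0, 2), (3, 2))
```

Because it never computes a slope, this formulation also handles vertical lines, as the `vertical` case shows.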

There's one cool trick in 2D geometry that I find very useful for calculating line intersections. To use it, we represent each 2D point and each 2D line in homogeneous 3D coordinates.
At first let's talk about 2D points:
Each 2D point (x, y) corresponds to a 3D line that passes through points (0, 0, 0) and (x, y, 1).
So (x, y, 1) and (α•x, α•y, α) and (β•x, β•y, β) correspond to the same point (x, y) in 2D space.
Here's the formula to convert a 2D point into homogeneous coordinates: (x, y) -> (x, y, 1)
Here's the formula to convert homogeneous coordinates into a 2D point: (x, y, ω) -> (x / ω, y / ω). If ω is zero, that means a "point at infinity", which doesn't correspond to any point in 2D space.
In OpenCV you may use convertPointsToHomogeneous() and convertPointsFromHomogeneous()
Now let's talk about 2D lines:
Each 2D line can be represented with three coordinates (a, b, c) which corresponds to 2D line equation: a•x + b•y + c = 0
So (a, b, c) and (ω•a, ω•b, ω•c) correspond to the same 2D line.
Also, (a, b, c) corresponds to (nx, ny, d) where (nx, ny) is unit length normal vector and d is distance from the line to (0, 0)
Also, (nx, ny, d) is (cos φ, sin φ, ρ) where (φ, ρ) are polar coordinates of the line.
There're two interesting formulas that link together points and lines:
Cross product of two distinct points in homogeneous coordinates gives homogeneous line coordinates: (α•x₁, α•y₁, α) ✕ (β•x₂, β•y₂, β) = (a, b, c)
Cross product of two distinct lines in homogeneous coordinates gives homogeneous coordinate of their intersection point: (a₁, b₁, c₁) ✕ (a₂, b₂, c₂) = (x, y, ω). If ω is zero that means lines are parallel (have no single intersection point in Euclidean geometry).
In OpenCV you may use either Mat::cross() or numpy.cross() to get cross product
If you're still here, you've got all you need to find lines given two points and intersection point given two lines.
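The two cross-product formulas can be tried out with a short sketch in plain Python. The helper names `cross3`, `line_through` and `meet` are hypothetical, and the exact comparison with zero is a simplification; real data would usually need a tolerance.

```python
def cross3(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def line_through(p, q):
    """Homogeneous line (a, b, c) through two distinct 2D points."""
    return cross3((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def meet(l1, l2):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    x, y, w = cross3(l1, l2)
    if w == 0:   # point at infinity: no single Euclidean intersection
        return None
    return (x / w, y / w)


l1 = line_through((0.0, 0.0), (1.0, 1.0))   # the line y = x
l2 = line_through((0.0, 1.0), (1.0, 0.0))   # the line x + y = 1
l3 = line_through((0.0, 1.0), (1.0, 2.0))   # the line y = x + 1, parallel to l1
point = meet(l1, l2)
no_point = meet(l1, l3)
```

Here `l1` comes out as (-1, 1, 0), i.e. -x + y = 0, and the two crossing lines meet at (0.5, 0.5), while the parallel pair yields ω = 0.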

An algorithm for finding line intersections is described very well in the post How do you detect where two line segments intersect?
The following is my OpenCV C++ implementation. It uses the same notation as the post above.
// 2D cross product (z-component of the 3D cross product).
// Defined first so it is declared before getIntersectionPoint uses it.
double cross(Point v1, Point v2){
    return v1.x*v2.y - v1.y*v2.x;
}
bool getIntersectionPoint(Point a1, Point a2, Point b1, Point b2, Point & intPnt){
    Point p = a1;
    Point q = b1;
    Point r(a2-a1);
    Point s(b2-b1);
    // Parallel (or collinear) lines have no single intersection point.
    if(cross(r,s) == 0) {return false;}
    double t = cross(q-p,s)/cross(r,s);
    intPnt = p + t*r;
    return true;
}

Here is my implementation for EmguCV (C#).
static PointF GetIntersection(LineSegment2D line1, LineSegment2D line2)
{
double a1 = (line1.P1.Y - line1.P2.Y) / (double)(line1.P1.X - line1.P2.X);
double b1 = line1.P1.Y - a1 * line1.P1.X;
double a2 = (line2.P1.Y - line2.P2.Y) / (double)(line2.P1.X - line2.P2.X);
double b2 = line2.P1.Y - a2 * line2.P1.X;
if (Math.Abs(a1 - a2) < double.Epsilon)
throw new InvalidOperationException();
double x = (b2 - b1) / (a1 - a2);
double y = a1 * x + b1;
return new PointF((float)x, (float)y);
}

Using homogeneous coordinates makes your life easier:
cv::Mat intersectionPoint(const cv::Mat& line1, const cv::Mat& line2)
{
    // Assume we receive lines as l=(a,b,c)^T, stored as 3x1 matrices of double
    assert(line1.rows == 3 && line1.cols == 1
        && line2.rows == 3 && line2.cols == 1);
    // Point is p=(x,y,w)^T
    cv::Mat point = line1.cross(line2);
    // Normalize so it is p'=(x',y',1)^T
    if( point.at<double>(2,0) != 0)
        point = point * (1.0/point.at<double>(2,0));
    return point;
}
Note that if the third coordinate is 0, the lines are parallel and there is no solution in R², only in P²; the point then represents a direction in 2D.

my implementation in Python (using numpy array)
with line1 = [[x1, y1],[x2, y2]] & line2 = [[x1, y1],[x2, y2]]
import sys
import numpy

def getIntersection(line1, line2):
    s1 = numpy.array(line1[0])
    e1 = numpy.array(line1[1])
    s2 = numpy.array(line2[0])
    e2 = numpy.array(line2[1])
    # Slope-intercept form (y = a*x + b); vertical lines are not handled.
    a1 = (s1[1] - e1[1]) / (s1[0] - e1[0])
    b1 = s1[1] - (a1 * s1[0])
    a2 = (s2[1] - e2[1]) / (s2[0] - e2[0])
    b2 = s2[1] - (a2 * s2[0])
    # Equal slopes mean the lines are parallel.
    if abs(a1 - a2) < sys.float_info.epsilon:
        return False
    x = (b2 - b1) / (a1 - a2)
    y = a1 * x + b1
    return (x, y)
