I want to compute the maximum value of a two-dimensional vector of f64s in Rust.
What is the easiest and most performant way to accomplish this?
My vector is declared as follows:
let (width,height) = (1920,1080);
let mut flow = vec![vec![0.0; height as usize]; width as usize];
My intuitive solution based on this answer would be the following: (Playground Link)
let max: &f64 = flow.iter()
    .map(|f| f
        .iter()
        .fold(f64::NEG_INFINITY, |prev, curr| prev.max(*curr)))
    .fold(f64::NEG_INFINITY, |prev, curr| prev.max(*curr));
However, that code does not compile and only works by converting the iterator of the inner vector into a reference: (Playground Link)
let max: &f64 = &(flow.iter()
    .map(|f| f
        .iter()
        .fold(f64::NEG_INFINITY, |prev, curr| prev.max(*curr))))
    .fold(f64::NEG_INFINITY, |prev, curr| prev.max(curr));
Why do I need to convert the inner iterator into a reference, and is there a better way to find the maximum of a two-dimensional f64 vector?
I'm not sure how fast this solution is, but it's clearer:
let max = flow
    .iter()
    .flatten()
    .max_by(|a, b| a.partial_cmp(b).unwrap());
The right way to write your loops would be something like this:
let max: f64 = flow.iter()
    .map(|f| f
        .iter()
        .fold(f64::NEG_INFINITY, |prev, curr| prev.max(*curr)))
    .fold(f64::NEG_INFINITY, |prev, curr| prev.max(curr));
To understand it, it helps to annotate the type of each intermediate value and variable:
let max: f64 = flow                  // Vec<Vec<f64>>
    .iter()                          // Iterator<Item=&Vec<f64>>
    .map(|f| f                       // f: &Vec<f64>
        .iter()                      // Iterator<Item=&f64>
        .fold(
            f64::NEG_INFINITY,       // f64
            |prev, curr|             // prev: f64, curr: &f64 -> f64
                prev.max(*curr)
        )                            // return of fold(): f64
    )                                // return of map(): Iterator<Item=f64>
    .fold(
        f64::NEG_INFINITY,           // f64
        |prev, curr|                 // prev: f64, curr: f64 -> f64
            prev.max(curr)
    )                                // return of fold(): f64
    ;
Note the signature of Iterator::fold:
fn fold<B, F>(self, init: B, f: F) -> B where
F: FnMut(B, Self::Item) -> B,
This means that fold returns the same type as its first argument, init. In your case that argument is always f64::NEG_INFINITY, so each fold returns an f64.
In the closure, the first argument has that same type, while the second has the type of the Item of the iterator on which fold is called. In the first fold that Item is &f64, but in the second one it is f64.
That is why curr: &f64 in the first closure and curr: f64 in the second one, which I think was the most confusing point.
That said, I would write instead:
let max = flow
    .iter()
    .flatten()
    .fold(f64::NEG_INFINITY, |prev, curr| prev.max(*curr));
Iterator::flatten, applied to an iterator whose items implement IntoIterator<Item=T>, returns a value that implements Iterator<Item=T> and iterates over the whole two-dimensional structure. Exactly what you need, I think.
If you want to get rid of the reference in curr, useful if your code in fold is more complex, you can also add copied():
let max = flow
    .iter()
    .flatten()
    .copied()
    .fold(f64::NEG_INFINITY, |prev, curr| prev.max(curr));
Iterator::copied just converts an Iterator<Item=&T> into an Iterator<Item=T>, as long as T: Copy.
I am trying to return a pair of sums using the let construct in SML. Every way I have tried returns only one value. I have also tried creating a list using cons (::) and returning the list, but that gives an error as well.
val t = [(3,4), (4,5), (5,6)];
fun sumPairs(nil) = 0
  | sumPairs((x,y)::zs) =
    let
      val sumFirst = x + sumPairs(zs)
      val sumSecond = y + sumPairs(zs)
    in
      (sumFirst, sumSecond) <how would I return this as a tuple or list?>
    end;
sumPairs(t);
The problem is not with (sumFirst, sumSecond) or with let specifically, but with the rest of your code.
The base case and the recursions say that sumPairs produces an int, not a pair of ints.
Because of this, there is a type conflict when you try to produce a pair.
Your base case should be (0,0), not 0, since it must be a pair.
You also need to deconstruct the result from the recursion since that produces a pair, not an integer.
Like this:
fun sumPairs nil = (0, 0)
  | sumPairs ((x,y)::zs) =
    let
      val (sumFirst, sumSecond) = sumPairs zs
    in
      (x + sumFirst, y + sumSecond)
    end;
Hello experienced pythoners.
The goal is simply to read in my own files which have the following format, and to then apply mathematical operations to these values and polynomials. The files have the following format:
m1:=10:
m2:=30:
Z1:=1:
Z2:=-1:
...
Some very similar variables, next come the laguerre polynomials
...
F:= (12.58295)*L(0,x)*L(1,y)*L(6,z) + (30.19372)*L(0,x)*L(2,y)*L(2,z) - ...:
Here L stands for a Laguerre polynomial and takes two arguments.
I have written a procedure in Python that splits each line into a left-hand side and a right-hand side, using the "=" character as the divider. The format of these files is always the same, but the number of Laguerre polynomials in F can vary.
import re
linestring = open("file.txt", "r").read()
linestring = re.sub("\n\n","\n",str(linestring))
linestring = re.sub(",\n",",",linestring)
linestring = re.sub("\\+\n","+",linestring)
linestring = re.sub(":=\n",":=",linestring)
linestring = re.sub(":\n","\n",linestring)
linestring = re.sub(":","",linestring)
LINES = linestring.split("\n")
for LINE in LINES:
    LINE = re.sub(" ","",LINE)
    print "LINE=", LINE
    if len(LINE) <=0:
        continue
    PAIR = LINE.split("=")
    print "PAIR=", PAIR
    LHS = PAIR[0]
    RHS = PAIR[1]
    print "LHS=", LHS
    print "RHS=", RHS
The first re.sub block just deals with formatting the file and discarding characters that Python will not be able to process; a loop then prints four things (LINE, PAIR, LHS and RHS), and it does this nicely. Using the example file from above, the procedure prints the following:
LINE= m1=1
PAIR= ['m1', '1']
LHS= m1
RHS= 1
LINE= m2=1
PAIR= ['m2', '1']
LHS= m2
RHS= 1
LINE= Z1=-1
PAIR= ['Z1', '-1']
LHS= Z1
RHS= -1
LINE= Z2=-1
PAIR= ['Z2', '-1']
LHS= Z2
RHS= -1
LINE= F= 12.5*L(0,x)L(1,y) + 30*L(0,x)L(2,y)L(2,z)
PAIR=['F', '12.5*L(0,x)L(1,y) + 30*L(0,x)L(2,y)L(2,z)']
LHS= F
RHS= 12.5*L(0,x)L(1,y) + 30*L(0,x)L(2,y)L(2,z)
My question is: what is the best next step to process this output and use it in a mathematical script, especially making L mean a Laguerre polynomial? I tried putting the LHS and RHS into a dictionary, but found it troublesome to put F in it because of the Laguerre polynomials.
Any ideas are welcome. Perhaps I am overcomplicating this and there is a much simpler way to parse this file.
Many thanks in advance
Your parsing algorithm doesn't seem to work correctly, as the RHS values of your variables don't match what the input file contains.
Also, the first re.sub block, where you format the file, seems overly complicated. Assuming every statement in your input file is terminated by a colon, you could get rid of all whitespace and newlines and separate the statements using the following code:
import re
linestring = open('file.txt','r').read()
strippedstring = linestring.replace('\n','').replace(' ','')
statements = re.split(':(?!=)',strippedstring)[:-1]
Then you iterate over the statements and split each one in LHS and RHS:
for st in statements:
    lhs,rhs = re.split(':=',st)
    print 'lhs=',lhs
    print 'rhs=',rhs
In the next step, try to distinguish normal float variables and polynomials:
    #evaluate rhs
    try:
        #interpret as numeric constant
        f = float(rhs)
        print " ",f
    except ValueError:
        #interpret as laguerre-polynomial
        summands = re.split(r'\+', re.sub('-','+-',rhs))
        for s in summands:
            m = re.match(r"^(?P<factor>-?[0-9]*(\.[0-9]*)?)(?P<poly>(\*?L\([0-9]+,[a-z]\))*)", s)
            if not m:
                print ' polynomial misformatted'
                continue
            f = m.group('factor')
            print ' factor: ',f
            p = m.group('poly')
            for l in re.finditer(r"L\((?P<a>[0-9]+),(?P<b>[a-z])\)",p):
                print ' poly: L(%s,%s)' % (l.group("a"),l.group("b"))
This should work for your given example file.
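If the goal is to go from printing the pieces to actually evaluating F, one possible next step is to store the constants in a dictionary and each summand as a coefficient plus a list of (degree, variable) pairs. The following is only a sketch in Python 3, not part of the code above. It assumes that L(n,x) means the Laguerre polynomial of degree n in the variable x, that each coefficient is an unsigned decimal (optionally wrapped in parentheses) with its sign written in front, and the names laguerre, parse and evaluate are made up for illustration.

```python
import re

def laguerre(n, x):
    """Evaluate the Laguerre polynomial L_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - x  # L_0(x) and L_1(x)
    for k in range(1, n):
        # (k + 1) * L_{k+1}(x) = (2k + 1 - x) * L_k(x) - k * L_{k-1}(x)
        prev, curr = curr, ((2 * k + 1 - x) * curr - k * prev) / (k + 1)
    return curr

# One summand: optional sign, coefficient (optionally parenthesised), then L(..) factors.
TERM = re.compile(
    r"(?P<sign>[+-]?)\(?(?P<factor>[0-9]*\.?[0-9]+)\)?"
    r"(?P<poly>(?:\*?L\([0-9]+,[a-z]\))+)"
)
FACTOR = re.compile(r"L\((?P<n>[0-9]+),(?P<var>[a-z])\)")

def parse(text):
    """Split 'name:=value:' statements into numeric constants and polynomial terms."""
    stripped = re.sub(r"\s+", "", text)
    constants, polynomials = {}, {}
    for statement in re.split(r":(?!=)", stripped):
        if ":=" not in statement:
            continue  # skips the empty tail and anything that is not an assignment
        lhs, rhs = statement.split(":=", 1)
        try:
            constants[lhs] = float(rhs)
        except ValueError:
            terms = []
            for m in TERM.finditer(rhs):
                coeff = float(m.group("sign") + m.group("factor"))
                factors = [(int(f.group("n")), f.group("var"))
                           for f in FACTOR.finditer(m.group("poly"))]
                terms.append((coeff, factors))
            polynomials[lhs] = terms
    return constants, polynomials

def evaluate(terms, **point):
    """Sum coeff * product of L_n(variable) over all parsed terms."""
    total = 0.0
    for coeff, factors in terms:
        product = coeff
        for n, var in factors:
            product *= laguerre(n, point[var])
        total += product
    return total

sample = "m1:=10:\nZ2:=-1:\nF:= (12.5)*L(0,x)*L(1,y) + (30)*L(0,x)*L(2,y)*L(2,z):\n"
constants, polynomials = parse(sample)
print(constants)
print(evaluate(polynomials["F"], x=0.5, y=1.0, z=2.0))
```

Keeping the terms as plain tuples means the same parsed structure could later be handed to a numerical or symbolic library instead of the hand-written recurrence.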
type Point<'t> =
    val X : 't
    val Y : 't
    new(x : 't,y : 't) = { X = x; Y = y }
let clampedSubtract (p1:Point<_>) (p2:Point<_>) =
    Point( max (p2.X - p1.X) 0, max (p2.Y - p1.Y) 0 )
If you look at the code above, you will notice that the function is not as generic as it should be.
First, using 0 in the max expressions fixes the type to int, whereas it should be whatever type Point<'t> carries, not Point<int>.
Even more importantly, this function can only work as expected if signed types are used for 't.
This raises a few questions:
Is there a way to obtain the neutral element (zero) from a generic (number) type?
How can I express a restriction such as "only signed number"?
Is there a way to extend the type constraint system in F#?
Thanks in advance.
The solution to the first question, as already answered, is to use an inline function together with GenericZero, and that's all.
Regarding the signed restriction, there is actually an easy way to restrict the function to signed types: somewhere in its body, use generic unary negation, which is defined only for signed types:
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
    let zero = LanguagePrimitives.GenericZero
    Point( max (p2.X + -p1.X) zero, max (p2.Y + -p1.Y) zero )
let result1 = clampedSubtract (Point(4 , 5 )) (Point(4 , 5 ))
let result2 = clampedSubtract (Point(4y , 5y )) (Point(4y , 5y ))
let result3 = clampedSubtract (Point(4uy, 5uy)) (Point(4uy, 5uy)) // doesn't compile
In general, if you want to restrict any generic function to signed types you can define this function:
let inline whenSigned x = ignore (-x)
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
    whenSigned p1.X
    let zero = LanguagePrimitives.GenericZero
    Point( max (p2.X - p1.X) zero, max (p2.Y - p1.Y) zero )
Finally, regarding your third question, it's not very clear to me what you mean by extending the type system. You can create static constraints yourself, so in that sense the system is already extensible.
Some time ago I did a project to emulate some Haskell types. Part of that code still lives in a module in FsControl, where you can get an idea of how far you can go with those constraints.
This makes it generic:
let inline clampedSubtract (p1:Point<_>) (p2:Point<_>) =
    let zero = LanguagePrimitives.GenericZero
    Point( max (p2.X - p1.X) zero, max (p2.Y - p1.Y) zero )
But there's no way to constrain it to signed primitive types.
In our F# code, the auto-generated GetHashCode implementation shows very poor performance and a high collision rate. Is this a problem in F#'s GetHashCode generator, or just an edge case?
open System
open System.Collections.Generic
let check keys e name =
    let dict = new Dictionary<_,_>(Array.length keys, e)//, HashIdentity.Structural)
    let stopWatch = System.Diagnostics.Stopwatch.StartNew()
    let add k = dict.Add(k, 1.02)
    Array.iter add keys
    stopWatch.Stop()
    let hashes = new HashSet<int>()
    let add_hash x = hashes.Add(e.GetHashCode(x)) |> not
    let collisions = Array.filter add_hash keys |> Array.length
    printfn "%s %f sec %f collisions" name stopWatch.Elapsed.TotalSeconds (double(collisions) / double(keys.Length))
type StructTuple<'T,'T2> =
    struct
        val fst: 'T
        val snd : 'T2
        new(fst: 'T, snd : 'T2) = {fst = fst; snd = snd}
    end
let bad_keys = seq{
    let rnd = new Random();
    while true do
        let j = uint32(rnd.Next(0, 3346862))
        let k = uint16 (rnd.Next(0, 658))
        yield StructTuple(j,k)
    }
let good_keys = seq{
    for k in 0us..658us do
        for j in 0u.. 3346862u do
            yield StructTuple(j,k)
    }
module CmpHelpers =
    let inline combine (h1:int) (h2:int) = (h1 <<< 5) + h1 ^^^ h2;
type StructTupleComparer<'T,'T2>() =
    let cmparer = EqualityComparer<Object>.Default
    interface IEqualityComparer<StructTuple<'T,'T2>> with
        member this.Equals (a,b) = cmparer.Equals(a.fst, b.fst) && cmparer.Equals(a.snd, b.snd)
        member this.GetHashCode (x) = CmpHelpers.combine (cmparer.GetHashCode(x.fst)) (cmparer.GetHashCode(x.snd))
type AutoGeneratedStructTupleComparer<'T,'T2>() =
    let cmparer = LanguagePrimitives.GenericEqualityComparer
    interface IEqualityComparer<StructTuple<'T,'T2>> with
        member this.Equals (a:StructTuple<'T,'T2>,b:StructTuple<'T,'T2>) =
            LanguagePrimitives.HashCompare.GenericEqualityERIntrinsic<'T> a.fst b.fst
            && LanguagePrimitives.HashCompare.GenericEqualityERIntrinsic<'T2> a.snd b.snd
        member this.GetHashCode (x:StructTuple<'T,'T2>) =
            let mutable num = 0
            num <- -1640531527 + (LanguagePrimitives.HashCompare.GenericHashWithComparerIntrinsic<'T2> cmparer x.snd + ((num <<< 6) + (num >>> 2)))
            -1640531527 + (LanguagePrimitives.HashCompare.GenericHashWithComparerIntrinsic<'T> cmparer x.fst + ((num <<< 6) + (num >>> 2)));
let uniq (sq:seq<'a>) = Array.ofSeq (new HashSet<_>(sq))
[<EntryPoint>]
let main argv =
    let count = 15000000
    let keys = good_keys |> Seq.take count |> uniq
    printfn "good keys"
    check keys (new StructTupleComparer<_,_>()) "struct custom"
    check keys HashIdentity.Structural "struct auto"
    check keys (new AutoGeneratedStructTupleComparer<_,_>()) "struct auto explicit"
    let keys = bad_keys |> Seq.take count |> uniq
    printfn "bad keys"
    check keys (new StructTupleComparer<_,_>()) "struct custom"
    check keys HashIdentity.Structural "struct auto"
    check keys (new AutoGeneratedStructTupleComparer<_,_>()) "struct auto explicit"
    Console.ReadLine() |> ignore
    0 // return an integer exit code
Output:
good keys
struct custom 1.506934 sec 0.000000 collisions
struct auto 4.832881 sec 0.776863 collisions
struct auto explicit 3.166931 sec 0.776863 collisions
bad keys
struct custom 3.631251 sec 0.061893 collisions
struct auto 10.340693 sec 0.777034 collisions
struct auto explicit 8.893612 sec 0.777034 collisions
I am no expert on the overall algorithm used to produce auto-generated Equals and GetHashCode, but it just seems to produce something non-optimal here. I don't know offhand if that is normal for a general-purpose auto-generated implementation, or if there are practical ways of auto-generating close-to-optimal implementations reliably.
It's worth noting that if you just use the standard tuple, the autogenerated hashing and comparison give the same collision rate and equal performance as your custom implementation. And using the latest F# 4.0 bits (where there has recently been a significant perf improvement in this area), the autogenerated stuff becomes significantly faster than the custom implementation.
My numbers:
// F# 3.1, struct tuples
good keys
custom 0.951254 sec 0.000000 collisions
auto 2.737166 sec 0.776863 collisions
bad keys
custom 2.923103 sec 0.061869 collisions
auto 7.706678 sec 0.777040 collisions
// F# 3.1, standard tuples
good keys
custom 0.995701 sec 0.000000 collisions
auto 0.965949 sec 0.000000 collisions
bad keys
custom 3.091821 sec 0.061869 collisions
auto 2.924721 sec 0.061869 collisions
// F# 4.0, standard tuples
good keys
custom 1.018672 sec 0.000000 collisions
auto 0.619066 sec 0.000000 collisions
bad keys
custom 3.082988 sec 0.061869 collisions
auto 1.829720 sec 0.061869 collisions
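To get a feel for where the struct collisions might come from, the arithmetic of the question's "struct auto explicit" GetHashCode can be replayed on a small grid of keys. The sketch below is in Python 3 and rests on assumptions I have not verified against FSharp.Core: that the hash of a uint32 or uint16 field is simply its numeric value, and that the posted comparer really mirrors the generated code. The helper names (to_i32, auto_hash, custom_hash, collision_rate) are made up for illustration.

```python
MAGIC = -1640531527  # constant copied from the question's GetHashCode

def to_i32(value):
    """Wrap a Python int to a signed 32-bit integer, like .NET int arithmetic."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def auto_hash(fst, snd):
    """Replay the two steps of the question's auto-generated-style hash."""
    # Step 1: num starts at 0, so the shift terms contribute nothing.
    num = to_i32(MAGIC + snd)
    # Step 2: the first field is only *added* to a shifted copy of num.
    mixed = to_i32(num << 6) + (num >> 2)  # >> is an arithmetic shift here
    return to_i32(MAGIC + fst + mixed)

def custom_hash(fst, snd):
    """The question's CmpHelpers.combine: ((h1 <<< 5) + h1) ^^^ h2."""
    return to_i32(((fst << 5) + fst) ^ snd)

def collision_rate(hash_fn, keys):
    """Fraction of keys whose hash was already produced by an earlier key."""
    seen = set()
    collisions = 0
    for fst, snd in keys:
        h = hash_fn(fst, snd)
        if h in seen:
            collisions += 1
        seen.add(h)
    return collisions / len(keys)

# A scaled-down version of good_keys: a few values of k, many consecutive j.
keys = [(j, k) for k in range(5) for j in range(1000)]
print("auto  ", collision_rate(auto_hash, keys))
print("custom", collision_rate(custom_hash, keys))
```

Under these assumptions, increasing snd by one moves the hash by only about 64, while increasing fst by one moves it by 1, so long runs of consecutive fst values for neighbouring snd values land on the same hashes. That would be consistent with the high collision rate reported for the struct case, but it is an illustration rather than a confirmed diagnosis.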
I opened an issue in the fsharp issue tracker, and it was accepted as a bug: https://github.com/fsharp/fsharp/issues/343
I'm trying to experiment with software-defined radio concepts. Based on this article, I've tried to implement a GPU-parallel Discrete Fourier Transform.
I'm pretty sure I could pre-calculate 90 degrees of sin(i) and cos(i) and then just flip and repeat, rather than what I'm doing in this code, and that this would speed it up. But so far I don't even think I'm getting correct answers. An all-zeros input gives a 0 result, as I'd expect, but an input of all 0.5 gives 78.9985886f (I'd expect a 0 result in this case too). Basically, I'm just generally confused: I don't have any good input data, and I don't know what to do with the result or how to verify it.
This question is related to my other post here
open Microsoft.ParallelArrays
open System
// X64MulticoreTarget is faster on my machine, unexpectedly
let target = new DX9Target() // new X64MulticoreTarget()
ignore(target.ToArray1D(new FloatParallelArray([| 0.0f |]))) // Dummy operation to warm up the GPU
let stopwatch = new System.Diagnostics.Stopwatch() // For benchmarking
let Hz = 50.0f
let fStep = (2.0f * float32(Math.PI)) / Hz
let shift = 0.0f // offset, once we have to adjust for the last batch of samples of a stream
// If I knew that the periodic function is periodic
// at whole-number intervals, I think I could keep
// shift within a smaller range to support streams
// without overflowing shift - but I haven't
// figured that out
//let elements = 8192 // maximum for a 1D array - makes sense as 2^13
//let elements = 7240 // maximum on my machine for a 2D array, but why?
let elements = 7240
// need good data!!
let buffer : float32[,] = Array2D.init<float32> elements elements (fun i j -> 0.5f) //(float32(i * elements) + float32(j)))
let input = new FloatParallelArray(buffer)
let seqN : float32[,] = Array2D.init<float32> elements elements (fun i j -> (float32(i * elements) + float32(j)))
let steps = new FloatParallelArray(seqN)
let shiftedSteps = ParallelArrays.Add(shift, steps)
let increments = ParallelArrays.Multiply(fStep, steps)
let cos_i = ParallelArrays.Cos(increments) // Real component series
let sin_i = ParallelArrays.Sin(increments) // Imaginary component series
stopwatch.Start()
// From the documentation, I think ParallelArrays.Multiply does standard element by
// element multiplication, not matrix multiplication
// Then we sum each element for each complex component (I don't understand the relationship
// of this, or the importance of the generalization to complex numbers)
let real = target.ToArray1D(ParallelArrays.Sum(ParallelArrays.Multiply(input, cos_i))).[0]
let imag = target.ToArray1D(ParallelArrays.Sum(ParallelArrays.Multiply(input, sin_i))).[0]
printf "%A in " ((real * real) + (imag * imag)) // sum the squares for the presence of the frequency
stopwatch.Stop()
printfn "%A" stopwatch.ElapsedMilliseconds
ignore (System.Console.ReadKey())
I share your surprise that your answer is not closer to zero. I'd suggest writing naive code to perform your DFT in F# and seeing if you can track down the source of the discrepancy.
Here's what I think you're trying to do:
let N = 7240
let F = 1.0f/50.0f
let pi = single System.Math.PI
let signal = [| for i in 1 .. N*N -> 0.5f |]
let real =
    seq { for i in 0 .. N*N-1 -> signal.[i] * (cos (2.0f * pi * F * (single i))) }
    |> Seq.sum
let img =
    seq { for i in 0 .. N*N-1 -> signal.[i] * (sin (2.0f * pi * F * (single i))) }
    |> Seq.sum
let power = real*real + img*img
Hopefully you can use this naive code to get a better intuition for how the accelerator code ought to behave, which could guide you in your testing of the accelerator code. Keep in mind that part of the reason for the discrepancy may simply be the precision of the calculations - there are ~52 million elements in your arrays, so accumulating a total error of 79 may not actually be too bad. FWIW, I get a power of ~0.05 when running the above single precision code, but a power of ~4e-18 when using equivalent code with double precision numbers.
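As a cross-check on the precision argument, here is a small sketch in Python 3 (chosen only because it is quick to run; it is not the accelerator code). It computes the same single frequency bin in double precision for a constant 0.5 signal covering a whole number of 50-sample periods, which is also the case for 7240 * 7240 samples. It also shows how single precision rounds large sample indices, which may be one contributor to the error, though I have not confirmed that this is what happens on the GPU. The function names are made up for illustration.

```python
import math
import struct

def dft_bin_power(signal, cycles_per_sample):
    """Power of one DFT bin, with the sums accumulated in double precision."""
    w = 2.0 * math.pi * cycles_per_sample
    real = math.fsum(s * math.cos(w * i) for i, s in enumerate(signal))
    imag = math.fsum(s * math.sin(w * i) for i, s in enumerate(signal))
    return real * real + imag * imag

def to_f32(value):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

# 100 whole periods of the 1/50 cycles-per-sample bin, constant input of 0.5.
signal = [0.5] * (50 * 100)
print(dft_bin_power(signal, 1.0 / 50.0))   # very close to 0

# Single precision has a 24-bit significand, so not every integer above
# 2**24 is representable; indices in a 52-million-element array exceed that.
print(to_f32(16777217.0))                  # prints 16777216.0
```

If the double-precision result is essentially zero while the single-precision pipeline is not, that points at rounding in the phase or in the accumulation rather than at the DFT formula itself.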
Two suggestions:
ensure you're not somehow confusing degrees with radians
try doing it sans-parallelism, or just with F#'s asyncs for parallelism
(In F#, if you have an array of floats
let a : float[] = ...
then you can 'add a step to all of them in parallel' to produce a new array with
let aShift = a |> Array.map (fun x -> async { return x + shift })
               |> Async.Parallel |> Async.RunSynchronously
(though I expect this might be slower than just doing a synchronous loop).)