How to run code in parallel? - f#

How can I run these two independent loops in parallel?
let a1=Seq.map2 (fun a b->(1.0-a)/(a-b)) a2 a3
let b1=Seq.map2 (fun a b->a*b) b2 b3

You can use standard .NET tasks for this - there are no special F# functions or special syntax for spawning a computation in the background:
open System.Threading.Tasks

let a1Work = Task.Factory.StartNew(fun () ->
    Array.map2 (fun a b -> (1.0 - a) / (a - b)) a2 a3)
let b1 = Array.map2 (fun a b -> a * b) b2 b3
let a1 = a1Work.Result // blocks until the background task has finished
I also changed your Seq.map2 to Array.map2 - computations on sequences are lazy and so running them inside a task would not actually do anything. With arrays, the whole calculation is completed immediately.
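The laziness mentioned above can be observed directly. The following is a minimal sketch; the counter `calls` and the helper `countAndDouble` are names made up for this illustration and are not part of the original code.

```fsharp
// Counter used only to observe when the mapping function actually runs.
let calls = ref 0

let countAndDouble x =
    calls.Value <- calls.Value + 1
    x * 2

// Seq.map only describes the work; nothing is evaluated at this point.
let doubled = Seq.map countAndDouble [ 1; 2; 3 ]
let callsBeforeEnumeration = calls.Value

// Forcing the sequence (here by copying it into an array) runs the function.
let forced = Seq.toArray doubled
let callsAfterEnumeration = calls.Value
```

If `Seq.map2` were started inside a task, the task would finish almost immediately and the real work would happen later, on whichever thread enumerates the result.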

If this is a pattern you find yourself using regularly, it could be useful to write a generic function that runs two Async computations in parallel, for example like this:
module Async =
    let Parallel2 (a: Async<'T>) (b: Async<'U>) : Async<'T * 'U> =
        async {
            let! res =
                Async.Parallel [|
                    async { let! a = a in return box a }
                    async { let! b = b in return box b }
                |]
            return (res.[0] :?> 'T, res.[1] :?> 'U)
        }
You can then use it like so:
async {
    let! a1, b1 =
        Async.Parallel2
            (async { return Array.map2 (fun a b -> (1.0 - a) / (a - b)) a2 a3 })
            (async { return Array.map2 (fun a b -> a * b) b2 b3 })
    // ...
}

Try Array.zip together with Array.Parallel.map:
Array.zip [|1;2;3|] [|2;3;4|] |> Array.Parallel.map (fun (a,b) -> a + b)
Or simply use the ParallelSeq library (see http://fsprojects.github.io/FSharp.Collections.ParallelSeq/ for more information).
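Applied to the two computations from the question, the zip-then-parallel-map idea might look like the sketch below. The input arrays are made-up sample values standing in for the question's `a2`, `a3`, `b2` and `b3`.

```fsharp
// Sample inputs (assumed values; the question does not show the real data).
let a2 = [| 0.5; 0.25; 0.75 |]
let a3 = [| 0.25; 0.5; 0.25 |]
let b2 = [| 1.0; 2.0; 3.0 |]
let b3 = [| 4.0; 5.0; 6.0 |]

// Pair the elements up first, then let Array.Parallel.map spread the work over cores.
let a1 = Array.zip a2 a3 |> Array.Parallel.map (fun (a, b) -> (1.0 - a) / (a - b))
let b1 = Array.zip b2 b3 |> Array.Parallel.map (fun (a, b) -> a * b)
```

Note that this parallelises the work inside each map rather than running the two maps side by side, which is usually only worthwhile for large arrays or expensive element functions.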

Related

F# Cannot enumerate sequence generated by yield when using GetEnumerator

The following example is based on a snippet that produces functions that allow enumerating sequence values one by one.
Here printAreEqual () gives true, print2 () gives 12345678910, but print1 () gives 0000000000.
Why cannot the function returned by enumerate return the values of the sequence generated using yield?
open System.Linq
let enumerate (xs: seq<_>) =
    use en = xs.GetEnumerator()
    fun () ->
        en.MoveNext() |> ignore
        en.Current
let s1 = seq { for i in 1 .. 10 do yield i }
let s2 = seq { 1 .. 10 }
let f1 = s1 |> enumerate
let f2 = s2 |> enumerate
let printAreEqual () = Enumerable.SequenceEqual (s1, s2) |> printf "%b" // true
let print1 () = for i in 1 .. 10 do f1() |> printf "%i" // 0000000000
let print2 () = for i in 1 .. 10 do f2() |> printf "%i" // 12345678910
The use en = ... in the enumerate function is effectively doing this:
let enumerate (xs: seq<_>) =
    let en = xs.GetEnumerator()
    let f =
        fun () ->
            en.MoveNext() |> ignore
            en.Current
    en.Dispose()
    f
You're always disposing of the enumerator before you start using it, so the behaviour is probably undefined in this situation and it doesn't matter why you get different results for two sequences with different implementations.
Fine-grained control of sequence enumeration is always tricky and it's hard to make helper functions for because of the mutable state.
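One possible way around the early disposal is to bind the enumerator with a plain `let`, so that it lives as long as the returned closure. This is only a sketch: it assumes the caller accepts that the enumerator is never disposed explicitly, and it returns an option (a change from the question's signature) so that the end of the sequence is visible.

```fsharp
let enumerate (xs: seq<'a>) =
    // Plain `let` instead of `use`: the enumerator is not disposed when
    // enumerate returns, so the closure below can keep advancing it.
    let en = xs.GetEnumerator()
    fun () ->
        if en.MoveNext() then Some en.Current else None

let s1 = seq { for i in 1 .. 3 do yield i }
let next = enumerate s1

// Four calls on a three-element sequence: the last one reports the end.
let firstFour = [ next (); next (); next (); next () ]
```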

Call async method in an inner lambda? "This construct may only be used within computation expressions"

I have the following code
let rec consume() : Async<unit> = async {
.....
listA
|> Seq.iter(fun i ->
.....
let listB : seq<...> option =
let c = getListB a b
match c with
| Some d -> Seq.filter(....) |> Some
| None -> None
match listB with .....
....
The function getListB has now been changed to return Async<seq<B>> instead of seq<B>, so I converted the code to the following. However, the call to getListB now blocks execution. How can I rewrite it so that it does not block? Simply changing the line to let! c = getListB a b does not work, presumably because the code is inside an inner lambda. The error message is "This construct may only be used within computation expressions".
let rec consume() : Async<unit> = async {
.....
listA
|> Seq.iter(fun i ->
.....
let listB : seq<...> option =
let c = getListB a b |> Async.RunSynchronously
match c with
| Some d -> Seq.filter(....) |> Some
| None -> None
I believe the problem you are describing boils down to how to convert a seq<Async<_>> to an Async<seq<_>>. This is described comprehensively in this post by Scott Wlaschin.
The following is a poor man's implementation of the concepts described in his post, which are far more powerful and generic. The general idea is to delay the creation of the sequence until we have the values promised by each Async<_> instance:
let traverseSequence (seqAsync: seq<Async<'a>>) =
    let promiseOfAnEmptySequence = async { return Seq.empty }
    let delayedCalculation (asyncHead: Async<'a>) (asyncTail: Async<seq<'a>>) =
        async {
            let! calculatedHead = asyncHead
            let! calculatedTail = asyncTail
            // Put the head in front of the tail so the original order is preserved.
            return Seq.append (Seq.singleton calculatedHead) calculatedTail
        }
    Seq.foldBack delayedCalculation seqAsync promiseOfAnEmptySequence
The answer depends on whether you want to run each element of the sequence sequentially or in parallel.
In both cases, start by using Seq.map instead of Seq.iter, then you can put another async block inside the lambda such that the result of the map is seq<Async<'a>>.
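As a rough illustration of that first step, the sketch below uses Seq.map with an inner async block and then combines the results with the built-in Async.Parallel. The `getListB` shown here is a hypothetical stand-in with invented behaviour, since the question does not show the real one, and the filter predicate is likewise made up.

```fsharp
// Hypothetical stand-in for the question's getListB (assumed signature and behaviour).
let getListB (a: int) (b: int) : Async<seq<int> option> =
    async {
        do! Async.Sleep 10
        return (if a % 2 = 0 then Some (seq { a .. a + b }) else None)
    }

let consume (listA: seq<int>) : Async<int list option []> =
    async {
        let work =
            listA
            |> Seq.map (fun i ->
                async {
                    // let! is allowed here because we are inside an async block again.
                    let! c = getListB i 2
                    return c |> Option.map (Seq.filter (fun x -> x > i) >> Seq.toList)
                })
        // work : seq<Async<int list option>>; Async.Parallel runs the items concurrently
        // and returns the results in input order.
        return! Async.Parallel work
    }

let results = consume [ 1; 2; 4 ] |> Async.RunSynchronously
```

Nothing inside `consume` blocks a thread; the only call to Async.RunSynchronously is at the very outside.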
Sequential
For this, you need to define some extra functions in an additional Async module.
module Async =
    let map f x =
        async {
            let! x = x
            return f x
        }
    let lift2 f x1 x2 =
        async {
            let! x1 = x1
            let! x2 = x2
            return f x1 x2
        }
    let return' x = async { return x }
    let mapM mFunc sequ =
        let consF x ys = lift2 (fun h t -> h :: t) (mFunc x) ys
        Seq.foldBack consF sequ (return' [])
        |> map Seq.ofList
    let sequence sequ = mapM id sequ
You might have seen mapM called traverse elsewhere, they are basically just different names for the same concept.
The sequence function is just a special case of mapM where the supplied binding function is just the identity (id) function. It has type seq<Async<'a>> -> Async<seq<'a>>, i.e. it flips the Async from being inside the Seq to being outside.
You then simply pipe the result of your Seq.map to the sequence function, which gives you an async value.
Your example code isn't complete so I made up some example code to use this:
let sleep = Async.Sleep 100
let sleeps = Seq.init 15 (fun _ -> sleep)
let sequencedSleeps = Async.sequence sleeps
Async.RunSynchronously sequencedSleeps
Real: 00:00:01.632, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : seq<unit> =
[null; null; null; null; null; null; null; null; null; null; null; null;
null; null; null]
Parallel
To execute each element of the sequence in parallel, instead of sequentially, you could do:
let pSequence sequ = Async.Parallel sequ |> Async.map (Seq.ofArray)
Example test code:
let pSleeps = pSequence sleeps;;
Async.RunSynchronously pSleeps;;
Real: 00:00:00.104, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : seq<unit> = seq [null; null; null; null; ...]
Note how the execution time depends on the chosen approach.
For the cases where you're getting back a seq<unit> and so want to ignore the result it can be useful to define some extra helper functions, such as:
let sequenceIgnore sequ = sequ |> Async.sequence |> Async.map (ignore)
let pSequenceIgnore sequ = sequ |> pSequence |> Async.map (ignore)
That lets you return a single unit rather than a superfluous sequence of them.

F sharp adding lists

How do you convert an obj list to an int list? I am trying to add two lists using the map function below, but it doesn't work on obj lists.
open System.Data
open System.Data.OleDb

let query f =
    seq {
        let cmd = new OleDbCommand("SELECT * FROM F")
        // @"..." is a verbatim string, so the backslashes in the path need no escaping.
        let conn = new OleDbConnection( @"Provider=Microsoft.ACE.OLEDB.12.0;
            Data Source=D:\Users\df\Documents\Vfolio.accdb;
            Persist Security Info=False;" )
        conn.Open()
        let DAdapt = new OleDbDataAdapter("SELECT * FROM F", conn)
        let DTab = new DataSet()
        let i = DAdapt.Fill(DTab)
        let rowCol = DTab.Tables.[0].Rows
        let rowCount = rowCol.Count
        for i in 0 .. (rowCount - 1) do
            yield f (rowCol.[i])
    }
let u= query(fun row -> row.[0])
let a= List.ofSeq u
let v=query(fun row -> row.[1])
let b= List.ofSeq v
let c = List.map2 (fun x y-> x + y) a b
error msg: The type 'obj' does not support the operator '+'
Because row.[i] returns type obj, your u and v become seq<obj>, and thus your a and b become type List<obj>, and therefore x and y are inferred to have type obj, and of course, you can't add two objs, which is exactly what the compiler tells you.
If you are sure that row.[0] and row.[1] are numbers of some kind, you should apply the appropriate cast, for example:
let u= query(fun row -> row.[0] :?> int)
let a= List.ofSeq u
let v=query(fun row -> row.[1] :?> int)
let b= List.ofSeq v
let c = List.map2 (fun x y-> x + y) a b
You can apply this cast in other places, too, depending on your taste and requirements, for example:
let c = List.map2 (fun x y-> (x :?> int) + (y :?> int)) a b
Or:
let a= u |> Seq.cast<int> |> List.ofSeq
let b= v |> Seq.cast<int> |> List.ofSeq
But I like the first example best, because it applies the cast at the earliest known point and results in the least amount of extra code.
But beware: if row.[0] turns out to be not an int at runtime, you will get an InvalidCastException.
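If an exception is not acceptable, a type-test pattern can be used instead of `:?>`. The sketch below is one way to do that; `tryInt`, `addIfInts` and the sample lists are names and data invented for this illustration. Real database columns may also hold DBNull or other numeric types, which this sketch simply treats as "not an int".

```fsharp
// Returns Some n when the boxed value is an int, None otherwise.
let tryInt (value: obj) =
    match value with
    | :? int as n -> Some n
    | _ -> None

// Adds two boxed values only when both are ints.
let addIfInts (x: obj) (y: obj) =
    match tryInt x, tryInt y with
    | Some a, Some b -> Some (a + b)
    | _ -> None

// Sample data standing in for the values read from the data rows.
let rawA : obj list = [ box 1; box 2; box "three" ]
let rawB : obj list = [ box 10; box 20; box 30 ]

let sums = List.map2 addIfInts rawA rawB
```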
P.S. In your List.map2 call, you could specify (+) directly instead of wrapping it in an extra lambda:
List.map2 (+) a b
P.P.S Also, it seems that your List.ofSeq calls are wasteful, for Seq also has a map2:
let u = query(fun row -> row.[0] :?> int)
let v = query(fun row -> row.[1] :?> int)
let c = Seq.map2 (+) u v |> List.ofSeq
P.P.P.S Also, have you noticed that each of the two calls to query generates its own DB connection, command, adapter, and dataset? Did you intend this or did you mean to only have one connection and then fetch different columns from the result? If so, you should only call query once:
let c = query( fun row -> (row.[0] :?> int) + (row.[1] :?> int) ) |> List.ofSeq

Pass function as a parameter and overload it

I want to time my functions, some of them use up to three parameters. Right now I'm using the same code below with some variations for the three.
let GetTime f (args : string) =
    let sw = Stopwatch.StartNew()
    f (args)
    printfn "%s : %A" args sw.Elapsed
I want to replace the three functions with this one.
let GetTime f ( args : 'T[]) =
    let sW = Stopwatch.StartNew()
    match args.Length with
    | 1 -> f args.[0]
    | 2 -> f (args.[0] args.[1])
    printfn "%A" sW.Elapsed
    ()
But I'm getting an error of type mismatch, if I use the three functions it works. Is it possible to send the function as a parameter and use it like this?
Why not just do something like this?
let getTime f =
    let sw = Stopwatch.StartNew()
    let result = f ()
    printfn "%A" sw.Elapsed
    result
Assuming that f1, f2, and f3 are three functions that take respectively 1, 2, and 3 arguments, you can use the getTime function like this:
getTime (fun () -> f1 "foo")
getTime (fun () -> f2 "foo" "bar")
getTime (fun () -> f3 "foo" "bar" "baz")
However, if you just need to time some functions in FSI, this feature is already built-in: just type
> #time;;
and timing will be turned on.
It isn't possible for the compiler to know how many arguments will be passed at runtime, so the function f must satisfy both 'T -> unit and 'T -> 'T -> unit. This form also requires all arguments to be of the same type.
The following approach delays the function execution and may be suitable for your needs.
let printTime f =
    let sw = Stopwatch.StartNew()
    f() |> ignore
    printfn "%A" sw.Elapsed
let f1 s = String.length s
let f2 s c = String.concat c s
printTime (fun () -> f1 "Test")
printTime (fun () -> f2 [| "Test1"; "Test2" |] ",")
You're probably thinking of passing a method group as an argument to GetTime, and then having the compiler decide which overload of the method group to call. That's not possible with any .NET compiler. Method groups are used for code analysis by compilers and tools such as ReSharper, but they are not something that actually exists at runtime.
If your functions take their arguments in tupled form, like these:
let f1 (s: string, b: bool) =
    System.Threading.Thread.Sleep 1000
    s
let f2 (n: int, s:string, dt: System.DateTime) =
    System.Threading.Thread.Sleep 1000
    n+1
then the implementation becomes trivial:
let Timed f args =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    let ret = f args
    printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
    ret
Usage:
f1
|> Timed // note, at this time we haven't yet applied any arguments
<| ("foo", true)
|> printfn "f1 done, returned %A"
f2
|> Timed
<| (42, "bar", DateTime.Now)
|> printfn "f2 done, returned %A"
However, if the functions take their arguments in curried form, like this:
let f1Curried (s: string) (b: bool) =
    System.Threading.Thread.Sleep 1000
    s
let f2Curried (n: int) (s:string) (dt: System.DateTime) =
    System.Threading.Thread.Sleep 1000
    n+1
it becomes a bit tricky. The idea is using standard operators (<|), (<||), and (<|||) that are intended to uncurry the arguments.
let Timed2 op f args =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    let ret = op f args
    printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
    ret
f1Curried
|> Timed2 (<||) // again, no arguments are passed yet
<| ("foo", true)
|> printfn "f1Curried done, returned %A"
f2Curried
|> Timed2 (<|||)
<| (42, "bar", DateTime.Now)
|> printfn "f2Curried done, returned %A"
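For readers who have not met these operators, the short sketch below shows what (<||) and (<|||) do on their own: they take a curried function and a tuple, and apply the function to the tuple's components. The functions `add` and `join` are made up for the illustration.

```fsharp
let add (x: int) (y: int) = x + y
let join (sep: string) (a: string) (b: string) = a + sep + b

// Ordinary curried application.
let viaApplication = add 1 2

// The same call, with both arguments supplied as a pair.
let viaPair = add <|| (1, 2)

// Three arguments supplied as a triple.
let viaTriple = join <||| ("-", "foo", "bar")
```

This is why Timed2 can keep a single `args` parameter: the operator passed as `op` decides how the tuple is unpacked.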

OCaml, F# successive, cascading let bindings

It is typical in OCaml or F# to have successive let bindings in the form:
let a1 = ...
let a2 = ...
let a3 = ...
let f1 = ...
let f2 = ...
let f3 = ...
f3 a1 a2 a3
In many cases some of these let bindings (e.g. f1 and f2 in the example above) are only used as building blocks of the expression or function immediately following them and not referenced again afterwards. In other cases some values are indeed used at the end of the "chain" (e.g. a1, a2 and a3 in the example above). Is there any syntactic idiom to make these differences in scope explicit?
One can use this to make it clear that temp is used only in the definition of a1:
let a1 =
  let temp = 42 in
  temp + 2 in
let a2 = ...
The scope of temp is indeed restricted to the definition of a1.
Another template is reusing the same name to hide its previous use, thus also making it clear that the previous use is temporary:
let result = input_string inchan in
let result = parse result in
let result = eval result in
result
Reusing the same name is debatable, though.
Of course one always has comments and empty lines:
let a1 = ...
let a2 = ...
let a3 = ...
(*We now define f3:*)
let f1 = ...
let f2 = ...
let f3 = ...
f3 a1 a2 a3
Edit: as pointed out by fmr, I'm also fond of the pipe operator. It is part of the standard library from OCaml 4.01 onwards; on older versions you can define it yourself:
let (|>) x f = f x;;
Then you can write something like
input_string inchan |> parse |> eval |> print
In addition to jrouquie's answer, you can avoid giving names to intermediate values by judicious use of function composition and other combinators. I especially like the following three provided by Batteries:
# let ( |> ) x f = f x;;
val ( |> ) : 'a -> ('a -> 'b) -> 'b = <fun>
# let ( |- ) f g x = g (f x);;
val ( |- ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c = <fun>
# let flip f x y = f y x;;
val flip : ('a -> 'b -> 'c) -> 'b -> 'a -> 'c = <fun>
A small example using |> is
# [1;2;3]
|> List.map string_of_int
|> String.concat "; "
|> Printf.sprintf "[%s]";;
- : string = "[1; 2; 3]"
You'll end up needing |- and flip in more realistic examples. This is known as point-free or tacit programming.
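On the F# side of the question, the built-in composition operator (>>) plays the role of |- above. The following is a small sketch of the same formatting example written point-free; the name `format` is made up for the illustration.

```fsharp
// No parameter is named: the function is built purely by composing three stages.
let format : int list -> string =
    List.map string >> String.concat "; " >> sprintf "[%s]"

let formatted = format [ 1; 2; 3 ]
```

The type annotation is there because `string` is generic; without it the compiler has no way to pick the element type.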
