F# How to preserve lazy /copy a unforced lazy statement - f#

The output from below is 15,9,9 however I want 15,9,21
I want to preserve a lazy version so I can put in a new function version in a composed function.
open System
let mutable add2 = fun x-> x+2
let mutable mult3 = fun x-> x*3
let mutable co = add2 >> mult3
let mutable com = lazy( add2 >> mult3)
let mutable com2 = com
add2<- fun x-> x
co 3|> printfn "%A"
com.Force() 3|> printfn "%A"
add2<- fun x-> x+4
com2.Force() 3|> printfn "%A"

I don't think you need lazy values here - lazy value is evaluated once when needed, but its value does not change afterwards. In your case, you need Force to re-evaluate the value in case some dependencies have changed. You can define something like this:
type Delayed<'T> =
| Delayed of (unit -> 'T)
member x.Force() = let (Delayed f) = x in f()
let delay f = Delayed f
This represents a delayed value (really, just a function) with Force method that will evaluate it each time it is accessed. If you rewrite your code using delay, it behaves as you wanted:
let mutable add2 = fun x-> x+2
let mutable mult3 = fun x-> x*3
let mutable com = delay(fun () -> add2 >> mult3)
let mutable com2 = com
add2 <- fun x -> x
com.Force() 3 |> printfn "%A"
add2 <- fun x -> x + 4
com2.Force() 3 |> printfn "%A"
Unlike lazy, this does not do any caching, so calling Force twice will just do the whole thing twice. You could add some caching by tracking a dependency graph of the computation, but it gets more complicated.

Related

function with mutable arguments

in F#, it's not allowed to have mutable argument with functions. but if I have a function like the following:
let f x =
while x>0 do
printfn "%d" x
x <- x-1;;
when I compile this, I'm getting a compiler error.(not mutable)
How would I fix this function?
For pass-by-value semantics, you can use parameter shadowing.
Here, a shadowed (because it's the same name) mutable value x is declared on the stack.
let f x =
let mutable x = x
while x > 0 do
printfn "%d" x
x <- x - 1
For pass-by-reference semantics, you'll have to use a reference cell, which is a just a fancy way of passing an object which has a mutable field in it. It's similar to ref in C#:
let f x =
while !x > 0 do
printfn "%d" !x
x := !x - 1
Using a reference cell:
let x = (ref 10)
f x
Debug.Assert(!x = 0)
You could rewrite it like this:
let f x = [x.. -1..1] |> List.iter (printfn "%d")
If you want to keep the while loop, you could do this:
let f (x : byref<int>)=
while x>0 do
printfn "%d" x
x <- x-1
let mutable x = 5
f &x
Depends on whether you want the final, mutated value of x to be passed back to the caller of the function or mutate it only locally.
For local-only mutation, just declare another variable, which would be mutable, but initially will have the value of x:
let f x =
let mutable i = x
while i>0 do
printfn "%d" i
i <- i-1
For passing the result back to the caller, you can use a ref-cell:
let f (x: int ref) =
while x.Value>0 do
printfn "%d" x.Value
x := x.Value - 1
Note that now you have to refer to the ref-cell contents via the .Value property (or you can instead use operator !, as in x := !x - 1), and the mutation is now done via :=. Plus, the consumer of such function now has to create a ref-cell before passing it in:
let x = ref 5
f x
printfn "%d" x.Value // prints "0"
Having said that, I must point out that mutation is generally less reliable, more error-prone than pure values. The normal way to write "loops" in pure functional programming is via recursion:
let rec f x =
if x > 0 then
printf "%d" x
f (x-1)
Here, each call to f makes another call to f with the value of x decreased by 1. This will have the same effect as the loop, but now there is no mutation, which means easier debugging and testing.

Broadcasting operations in MathNet

Suppose I have the following data:
var1,var2,var3
0.942856823,0.568425866,0.325885379
1.227681099,1.335672206,0.925331054
1.952671045,1.829479996,1.512280854
2.45428731,1.990174152,1.534456808
2.987783477,2.78975186,1.725095748
3.651682331,2.966399127,1.972274564
3.768010479,3.211381506,1.993080807
4.509429614,3.642983433,2.541071547
4.81498729,3.888415006,3.218031802
Here is the code:
open System.IO
open MathNet.Numerics.LinearAlgebra
let rows = [|for line in File.ReadAllLines("Z:\\mypath.csv")
|> Seq.skip 1 do yield line.Split(',') |> Array.map float|]
let data = DenseMatrix.ofRowArrays rows
let data_logdiff =
DenseMatrix.init (data.RowCount-1) (data.ColumnCount)
(fun j i -> if j = 0 then 0. else data.At(j, i) / data.At(j-1, i) |> log)
let alpha = vector [for i in data_logdiff.EnumerateColumns() -> i |> Statistics.Mean]
let sigsq (values:Vector<float>) (avg: float) =
let sqr x = x * x
let result = values |> (fun i -> sqr (i - avg))
result
sigsq (data_logdiff.Column(i), alpha.[0]) |> printfn "%A"
Error: The type ''a * 'b' is not compatible with the type 'Vector<float>'
This is all for a broadcast operation between a matrix and a vector. All these acrobatics to do a simple mean((y-alpha).^2) in MATLAB.
You have a mistake in your code, and the F# compiler complains about it, albeit in a somewhat obscure way. You define your function:
let sigsq (values:Vector<float>) (avg: float) =
This is a function that takes two arguments. (Actually it's a function taking one argument, returning another function taking one argument.) But you call it like this:
sigsq (data_logdiff.Column(i), alpha.[0]) |> printfn "%A"
You tuple the arguments, and for F# functions (a,b) is one argument, which is a tuple. You should call your function like this:
sigsq (data_logdiff.Column(0)) (alpha.[0])
or
sigsq <| data_logdiff.Column(0) <| alpha.[0]
and my favorite one:
data_logdiff.Column(0) |> sigsq <| alpha.[0]
I replaced the (i) with 0 in your code. You can map through the columns if you want to loop:
data_logdiff.EnumerateColumnsIndexed() |> Seq.map (fun (i,col) -> sigsq col alpha.[i])

Understanding Mutability in F# : case study

I'm a beginner in F#, and this is my first attempt at programming something serious. I'm sorry the code is a bit long, but there are some issues with mutability that I don't understand.
This is an implementation of the Karger MinCut Algorithm to calculate the mincut on a non-directed graph component. I won't discuss here how the algo works,
for more info https://en.wikipedia.org/wiki/Karger%27s_algorithm
What is important is it's a randomized algorithm, which is running a determined number of trial runs, and taking the "best" run.
I realize now that I could avoid a lot of the problems below if I did construct a specific function for each random trial, but I'd like to understand EXACTLY what is wrong in the implementation below.
I'm running the code on this simple graph (the mincut is 2 when we cut the graph
into 2 components (1,2,3,4) and (5,6,7,8) with only 2 edges between those 2 components)
3--4-----5--6
|\/| |\/|
|/\| |/\|
2--1-----7--8
the file simplegraph.txt should encode this graph as follow
(1st column = node number, other columns = links)
1 2 3 4 7
2 1 3 4
3 1 2 4
4 1 2 3 5
5 4 6 7 8
6 5 7 8
7 1 5 6 8
8 5 6 7
This code may look too much as imperative programming yet, I'm sorry for that.
So There is a main for i loop calling each trial.
the first execution, (when i=1) looks smooth and perfect,
but I have runtime error execution when i=2, because it looks some variables,
like WG are not reinitialized correctly, causing out of bound errors.
WG, WG1 and WGmin are type wgraphobj, which are a record of Dictionary objects
WG1 is defined outside the main loop and i make no new assignments to WG1.
[but its type is mutable though, alas]
I defined first WG with the instruction
let mutable WG = WG1
then at the beginning of the for i loop,
i write
WG <- WG1
and then later, i modify the WG object in each trial to make some calculations.
when the trial is finished and we go to the next trial (i is increased) i want to reset WG to its initial state being like WG1.
but it seems its not working, and I don't get why...
Here is the full code
MyModule.fs [some functions not necessary for execution]
namespace MyModule
module Dict =
open System.Collections.Generic
let toSeq d = d |> Seq.map (fun (KeyValue(k,v)) -> (k,v))
let toArray (d:IDictionary<_,_>) = d |> toSeq |> Seq.toArray
let toList (d:IDictionary<_,_>) = d |> toSeq |> Seq.toList
let ofMap (m:Map<'k,'v>) = new Dictionary<'k,'v>(m) :> IDictionary<'k,'v>
let ofList (l:('k * 'v) list) = new Dictionary<'k,'v>(l |> Map.ofList) :> IDictionary<'k,'v>
let ofSeq (s:('k * 'v) seq) = new Dictionary<'k,'v>(s |> Map.ofSeq) :> IDictionary<'k,'v>
let ofArray (a:('k * 'v) []) = new Dictionary<'k,'v>(a |> Map.ofArray) :> IDictionary<'k,'v>
Karger.fs
open MyModule.Dict
open System.IO
let x = File.ReadAllLines "\..\simplegraph.txt";;
// val x : string [] =
let splitAtTab (text:string)=
text.Split [|'\t';' '|]
let splitIntoKeyValue (s:seq<'T>) =
(Seq.head s, Seq.tail s)
let parseLine (line:string)=
line
|> splitAtTab
|> Array.filter (fun s -> not(s=""))
|> Array.map (fun s-> (int s))
|> Array.toSeq
|> splitIntoKeyValue
let y =
x |> Array.map parseLine
open System.Collections.Generic
// let graph = new Map <int, int array>
let graphD = new Dictionary<int,int seq>()
y |> Array.iter graphD.Add
let graphM = y |> Map.ofArray //immutable
let N = y.Length // number of nodes
let Nruns = 2
let remove_table = new Dictionary<int,bool>()
[for i in 1..N do yield (i,false)] |> List.iter remove_table.Add
// let remove_table = seq [|for a in 1 ..N -> false|] // plus court
let label_head_table = new Dictionary<int,int>()
[for i in 1..N do yield (i,i)] |> List.iter label_head_table.Add
let label = new Dictionary<int,int seq>()
[for i in 1..N do yield (i,[i])] |> List.iter label.Add
let mutable min_cut = 1000000
type wgraphobj =
{ Graph : Dictionary<int,int seq>
RemoveTable : Dictionary<int,bool>
Label : Dictionary<int,int seq>
LabelHead : Dictionary<int,int> }
let WG1 = {Graph = graphD;
RemoveTable = remove_table;
Label = label;
LabelHead = label_head_table}
let mutable WGmin = WG1
let IsNotRemoved x = //
match x with
| (i,false) -> true
| (i,true) -> false
let IsNotRemoved1 WG i = //
(i,WG.RemoveTable.[i]) |>IsNotRemoved
let GetLiveNode d =
let myfun x =
match x with
| (i,b) -> i
d |> toList |> List.filter IsNotRemoved |> List.map myfun
let rand = System.Random()
// subsets a dictionary given a sub_list of keys
let D_Subset (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = Dictionary<'T,'U>() // create new empty dictionary
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (k, dict.[k]))
|> List.iter (fun s -> z.Add s)
z
// subsets a dictionary given a sub_list of keys to remove
let D_SubsetC (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = dict
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (dict.Remove k)) |>ignore
z
// subsets a sequence by values in a sequence
let S_Subset (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> Seq.exists (fun elem -> elem = s) sub_list)
let S_SubsetC (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> not(Seq.exists (fun elem -> elem = s) sub_list))
[<EntryPoint>]
let main argv =
let mutable u = 0
let mutable v = 0
let mutable r = 0
let mutable N_cut = 1000000
let mutable cluster_A_min = seq [0]
let mutable cluster_B_min = seq [0]
let mutable WG = WG1
let mutable LiveNodeList = [0]
// when i = 2, i encounter problems with mutability
for i in 1 .. Nruns do
WG <- WG1
printfn "%d" i
for k in 1..(N-2) do
LiveNodeList <- GetLiveNode WG.RemoveTable
r <- rand.Next(0,N-k)
u <- LiveNodeList.[r] //selecting a live node
let uuu = WG.Graph.[u] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let n_edge = uuu |> Seq.length
let x = rand.Next(1,n_edge)
let mutable ok = false //maybe we can take this out
while not(ok) do
// selecting the edge from node u
v <- WG.LabelHead.[Array.get (uuu |> Seq.toArray) (x-1)]
let vvv = WG.Graph.[v] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let zzz = S_SubsetC (Seq.concat [uuu;vvv] |> Seq.distinct) [u;v]
WG.Graph.[u] <- zzz
let lab_u = WG.Label.[u]
let lab_v = WG.Label.[v]
WG.Label.[u] <- Seq.concat [lab_u;lab_v] |> Seq.distinct
if (k<N-1) then
WG.RemoveTable.[v]<-true
//updating Label_head for all members of Label.[v]
WG.LabelHead.[v]<- u
for j in WG.Label.[v] do
WG.LabelHead.[j]<- u
ok <- true
printfn "u= %d v=%d" u v
// end of for k in 1..(N-2)
// counting cuts
// u,v contain the 2 indexes of groupings
let cluster_A = WG.Label.[u]
let cluster_B = S_SubsetC (seq[for i in 1..N do yield i]) cluster_A // defined as complementary of A
// let WG2 = {Graph = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// RemoveTable = remove_table
// Label = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// LabelHead = label_head_table}
let cross_edge = // returns keyvalue pair (k,S')
let IsInCluster cluster (k,S) =
(k,S_Subset S cluster)
graphM |> toSeq |> Seq.map (IsInCluster cluster_B)
N_cut <-
cross_edge |> Seq.map (fun (k:int,v:int seq)-> Seq.length v)
|> Seq.sum
if (N_cut<min_cut) then
min_cut <- N_cut
WGmin <- WG
cluster_A_min <- cluster_A
cluster_B_min <- cluster_B
// end of for i in 1..Nruns
0 // return an integer exit code
Description of the algo: (i don't think its too essential to solve my problem)
at each trial, there are several steps. at each step, we merge 2 nodes into 1, (removing effectively 1) updating the graph. we do that 6 times until there are only 2 nodes left, which we define as 2 clusters, and we look at the number of cross edges between those 2 clusters. if we are "lucky" those 2 clusters would be (1,2,3,4) and (5,6,7,8) and find the right number of cuts.
at each step, the object WG is updated with the effects of merging 2 nodes
with only LiveNodes (the ones which are not eliminated as a result of merging 2 nodes) being perfectly kept up to date.
WG.Graph is the updated graph
WG.Label contains the labels of the nodes which have been merged into the current node
WG.LabelHead contains the label of the node into which that node has been merged
WG.RemoveTable says if the node has been removed or not.
Thanks in advance for anyone willing to take a look at it !
"It seems not working", because wgraphobj is a reference type, which is allocated on the stack, which means that when you're mutating the innards of WG, you're also mutating the innards of WG1, because they're the same innards.
This is precisely the kind of mess you get yourself into if you use mutable state. This is why people recommend to not use it. In particular, your use of mutable dictionaries undermines the robustness of your algorithm. I recommend using the F#'s own efficient immutable dictionary (called Map) instead.
Now, in response to your comment about WG.Graph <- GraphD giving compile error.
WG is mutable, but WG.Graph is not (but the contents of WG.Graph are again mutable). There is a difference, let me try to explain it.
WG is mutable in the sense that it points to some object of type wgraphobj, but you can make it, in the course of your program, to point at another object of the same type.
WG.Graph, on the other hand, is a field packed inside WG. It points to some object of type Dictionary<_,_>. And you cannot make it point to another object. You can create a different wgraphobj, in which the field Graph point to a different dictionary, but you cannot change where the field Graph of the original wgraphobj points.
In order to make the field Graph itself mutable, you can declare it as such:
type wgraphobj = {
mutable Graph: Dictionary<int, int seq>
...
Then you will be able to mutate that field:
WG.Graph <- GraphD
Note that in this case you do not need to declare the value WG itself as mutable.
However, it seems to me that for your purposes you can actually go the way of creating a new instance wgraphobj with the field Graph changed, and assigning it to the mutable reference WG:
WG.Graph <- { WG with Graph = GraphD }

Tail Recursion in F# : Stack Overflow

I'm trying to implement Kosaraju's algorithm on a large graph
as part of an assignment [MOOC Algo I Stanford on Coursera]
https://en.wikipedia.org/wiki/Kosaraju%27s_algorithm
The current code works on a small graph, but I'm hitting Stack Overflow during runtime execution.
Despite having read the relevant chapter in Expert in F#, or other available examples on websites and SO, i still don't get how to use continuation to solve this problem
Below is the full code for general purpose, but it will already fail when executing DFSLoop1 and the recursive function DFSsub inside. I think I'm not making the function tail recursive [because of the instructions
t<-t+1
G.[n].finishingtime <- t
?]
but i don't understand how i can implement the continuation properly.
When considering only the part that fails, DFSLoop1 is taking as argument a graph to which we will apply Depth-First Search. We need to record the finishing time as part of the algo to proceed to the second part of the algo in a second DFS Loop (DFSLoop2) [of course we are failing before that].
open System
open System.Collections.Generic
open System.IO
let x = File.ReadAllLines "C:\Users\Fagui\Documents\GitHub\Learning Fsharp\Algo Stanford I\PA 4 - SCC.txt";;
// let x = File.ReadAllLines "C:\Users\Fagui\Documents\GitHub\Learning Fsharp\Algo Stanford I\PA 4 - test1.txt";;
// val x : string [] =
let splitAtTab (text:string)=
text.Split [|'\t';' '|]
let splitIntoKeyValue (A: int[]) =
(A.[0], A.[1])
let parseLine (line:string)=
line
|> splitAtTab
|> Array.filter (fun s -> not(s=""))
|> Array.map (fun s-> (int s))
|> splitIntoKeyValue
let y =
x |> Array.map parseLine
//val it : (int * int) []
type Children = int[]
type Node1 =
{children : Children ;
mutable finishingtime : int ;
mutable explored1 : bool ;
}
type Node2 =
{children : Children ;
mutable leader : int ;
mutable explored2 : bool ;
}
type DFSgraphcore = Dictionary<int,Children>
let directgraphcore = new DFSgraphcore()
let reversegraphcore = new DFSgraphcore()
type DFSgraph1 = Dictionary<int,Node1>
let reversegraph1 = new DFSgraph1()
type DFSgraph2 = Dictionary<int,Node2>
let directgraph2 = new DFSgraph2()
let AddtoGraph (G:DFSgraphcore) (n,c) =
if not(G.ContainsKey n) then
let node = [|c|]
G.Add(n,node)
else
let c'= G.[n]
G.Remove(n) |> ignore
G.Add (n, Array.append c' [|c|])
let inline swaptuple (a,b) = (b,a)
y|> Array.iter (AddtoGraph directgraphcore)
y|> Array.map swaptuple |> Array.iter (AddtoGraph reversegraphcore)
for i in directgraphcore.Keys do
if reversegraphcore.ContainsKey(i) then do
let node = {children = reversegraphcore.[i] ;
finishingtime = -1 ;
explored1 = false ;
}
reversegraph1.Add (i,node)
else
let node = {children = [||] ;
finishingtime = -1 ;
explored1 = false ;
}
reversegraph1.Add (i,node)
directgraphcore.Clear |> ignore
reversegraphcore.Clear |> ignore
// for i in reversegraph1.Keys do printfn "%d %A" i reversegraph1.[i].children
printfn "pause"
Console.ReadKey() |> ignore
let num_nodes =
directgraphcore |> Seq.length
let DFSLoop1 (G:DFSgraph1) =
let mutable t = 0
let mutable s = -1
let mutable k = num_nodes
let rec DFSsub (G:DFSgraph1)(n:int) (cont:int->int) =
//how to make it tail recursive ???
G.[n].explored1 <- true
// G.[n].leader <- s
for j in G.[n].children do
if not(G.[j].explored1) then DFSsub G j cont
t<-t+1
G.[n].finishingtime <- t
// end of DFSsub
for i in num_nodes .. -1 .. 1 do
printfn "%d" i
if not(G.[i].explored1) then do
s <- i
( DFSsub G i (fun s -> s) ) |> ignore
// printfn "%d %d" i G.[i].finishingtime
DFSLoop1 reversegraph1
printfn "pause"
Console.ReadKey() |> ignore
for i in directgraphcore.Keys do
let node = {children =
directgraphcore.[i]
|> Array.map (fun k -> reversegraph1.[k].finishingtime) ;
leader = -1 ;
explored2= false ;
}
directgraph2.Add (reversegraph1.[i].finishingtime,node)
let z = 0
let DFSLoop2 (G:DFSgraph2) =
let mutable t = 0
let mutable s = -1
let mutable k = num_nodes
let rec DFSsub (G:DFSgraph2)(n:int) (cont:int->int) =
G.[n].explored2 <- true
G.[n].leader <- s
for j in G.[n].children do
if not(G.[j].explored2) then DFSsub G j cont
t<-t+1
// G.[n].finishingtime <- t
// end of DFSsub
for i in num_nodes .. -1 .. 1 do
if not(G.[i].explored2) then do
s <- i
( DFSsub G i (fun s -> s) ) |> ignore
// printfn "%d %d" i G.[i].leader
DFSLoop2 directgraph2
printfn "pause"
Console.ReadKey() |> ignore
let table = [for i in directgraph2.Keys do yield directgraph2.[i].leader]
let results = table |> Seq.countBy id |> Seq.map snd |> Seq.toList |> List.sort |> List.rev
printfn "%A" results
printfn "pause"
Console.ReadKey() |> ignore
Here is a text file with a simple graph example
1 4
2 8
3 6
4 7
5 2
6 9
7 1
8 5
8 6
9 7
9 3
(the one which is causing overflow is 70Mo big with around 900,000 nodes)
EDIT
to clarify a few things first
Here is the "pseudo code"
Input: a directed graph G = (V,E), in adjacency list representation. Assume that the vertices V are labeled
1, 2, 3, . . . , n.
1. Let Grev denote the graph G after the orientation of all arcs have been reversed.
2. Run the DFS-Loop subroutine on Grev, processing vertices according to the given order, to obtain a
finishing time f(v) for each vertex v ∈ V .
3. Run the DFS-Loop subroutine on G, processing vertices in decreasing order of f(v), to assign a leader
to each vertex v ∈ V .
4. The strongly connected components of G correspond to vertices of G that share a common leader.
Figure 2: The top level of our SCC algorithm. The f-values and leaders are computed in the first and second
calls to DFS-Loop, respectively (see below).
Input: a directed graph G = (V,E), in adjacency list representation.
1. Initialize a global variable t to 0.
[This keeps track of the number of vertices that have been fully explored.]
2. Initialize a global variable s to NULL.
[This keeps track of the vertex from which the last DFS call was invoked.]
3. For i = n downto 1:
[In the first call, vertices are labeled 1, 2, . . . , n arbitrarily. In the second call, vertices are labeled by
their f(v)-values from the first call.]
(a) if i not yet explored:
i. set s := i
ii. DFS(G, i)
Figure 3: The DFS-Loop subroutine.
Input: a directed graph G = (V,E), in adjacency list representation, and a source vertex i ∈ V .
1. Mark i as explored.
[It remains explored for the entire duration of the DFS-Loop call.]
2. Set leader(i) := s
3. For each arc (i, j) ∈ G:
(a) if j not yet explored:
i. DFS(G, j)
4. t + +
5. Set f(i) := t
Figure 4: The DFS subroutine. The f-values only need to be computed during the first call to DFS-Loop, and
the leader values only need to be computed during the second call to DFS-Loop.
EDIT
i have amended the code, with the help of an experienced programmer (a lisper but who has no experience in F#) simplifying somewhat the first part to have more quickly an example without bothering about non-relevant code for this discussion.
The code focuses only on half of the algo, running DFS once to get finishing times of the reversed tree.
This is the first part of the code just to create a small example
y is the original tree. the first element of a tuple is the parent, the second is the child. But we will be working with the reverse tree
open System
open System.Collections.Generic
open System.IO
let x = File.ReadAllLines "C:\Users\Fagui\Documents\GitHub\Learning Fsharp\Algo Stanford I\PA 4 - SCC.txt";;
// let x = File.ReadAllLines "C:\Users\Fagui\Documents\GitHub\Learning Fsharp\Algo Stanford I\PA 4 - test1.txt";;
// val x : string [] =
let splitAtTab (text:string)=
text.Split [|'\t';' '|]
let splitIntoKeyValue (A: int[]) =
(A.[0], A.[1])
let parseLine (line:string)=
line
|> splitAtTab
|> Array.filter (fun s -> not(s=""))
|> Array.map (fun s-> (int s))
|> splitIntoKeyValue
// let y =
// x |> Array.map parseLine
//let y =
// [|(1, 4); (2, 8); (3, 6); (4, 7); (5, 2); (6, 9); (7, 1); (8, 5); (8, 6);
// (9, 7); (9, 3)|]
// let y = Array.append [|(1,1);(1,2);(2,3);(3,1)|] [|for i in 4 .. 10000 do yield (i,4)|]
let y = Array.append [|(1,1);(1,2);(2,3);(3,1)|] [|for i in 4 .. 99999 do yield (i,i+1)|]
//val it : (int * int) []
type Children = int list
type Node1 =
{children : Children ;
mutable finishingtime : int ;
mutable explored1 : bool ;
}
type Node2 =
{children : Children ;
mutable leader : int ;
mutable explored2 : bool ;
}
type DFSgraphcore = Dictionary<int,Children>
let directgraphcore = new DFSgraphcore()
let reversegraphcore = new DFSgraphcore()
type DFSgraph1 = Dictionary<int,Node1>
let reversegraph1 = new DFSgraph1()
let AddtoGraph (G:DFSgraphcore) (n,c) =
if not(G.ContainsKey n) then
let node = [c]
G.Add(n,node)
else
let c'= G.[n]
G.Remove(n) |> ignore
G.Add (n, List.append c' [c])
let inline swaptuple (a,b) = (b,a)
y|> Array.iter (AddtoGraph directgraphcore)
y|> Array.map swaptuple |> Array.iter (AddtoGraph reversegraphcore)
// définir reversegraph1 = ... with....
for i in reversegraphcore.Keys do
let node = {children = reversegraphcore.[i] ;
finishingtime = -1 ;
explored1 = false ;
}
reversegraph1.Add (i,node)
for i in directgraphcore.Keys do
if not(reversegraphcore.ContainsKey(i)) then do
let node = {children = [] ;
finishingtime = -1 ;
explored1 = false ;
}
reversegraph1.Add (i,node)
directgraphcore.Clear |> ignore
reversegraphcore.Clear |> ignore
// for i in reversegraph1.Keys do printfn "%d %A" i reversegraph1.[i].children
printfn "pause"
Console.ReadKey() |> ignore
let num_nodes =
directgraphcore |> Seq.length
So basically the graph is (1->2->3->1)::(4->5->6->7->8->....->99999->10000)
and the reverse graph is (1->3->2->1)::(10000->9999->....->4)
here is the main code written in direct style
//////////////////// main code is below ///////////////////
let DFSLoop1 (G:DFSgraph1) =
let mutable t = 0
let mutable s = -1
let rec iter (n:int) (f:'a->unit) (list:'a list) : unit =
match list with
| [] -> (t <- t+1) ; (G.[n].finishingtime <- t)
| x::xs -> f x ; iter n f xs
let rec DFSsub (G:DFSgraph1) (n:int) : unit =
let my_f (j:int) : unit = if not(G.[j].explored1) then (DFSsub G j)
G.[n].explored1 <- true
iter n my_f G.[n].children
for i in num_nodes .. -1 .. 1 do
// printfn "%d" i
if not(G.[i].explored1) then do
s <- i
DFSsub G i
printfn "%d %d" i G.[i].finishingtime
// End of DFSLoop1
DFSLoop1 reversegraph1
printfn "pause"
Console.ReadKey() |> ignore
its not tail recursive, so we use continuations, here is the same code adapted to CPS style:
//////////////////// main code is below ///////////////////
let DFSLoop1 (G:DFSgraph1) =
let mutable t = 0
let mutable s = -1
let rec iter_c (n:int) (f_c:'a->(unit->'r)->'r) (list:'a list) (cont: unit->'r) : 'r =
match list with
| [] -> (t <- t+1) ; (G.[n].finishingtime <- t) ; cont()
| x::xs -> f_c x (fun ()-> iter_c n f_c xs cont)
let rec DFSsub (G:DFSgraph1) (n:int) (cont: unit->'r) : 'r=
let my_f_c (j:int)(cont:unit->'r):'r = if not(G.[j].explored1) then (DFSsub G j cont) else cont()
G.[n].explored1 <- true
iter_c n my_f_c G.[n].children cont
for i in maxnum_nodes .. -1 .. 1 do
// printfn "%d" i
if not(G.[i].explored1) then do
s <- i
DFSsub G i id
printfn "%d %d" i G.[i].finishingtime
DFSLoop1 reversegraph1
printfn "faré"
printfn "pause"
Console.ReadKey() |> ignore
both codes compile and give the same results for the small example (the one in comment) or the same tree that we are using , with a smaller size (1000 instead of 100000)
so i don't think its a bug in the algo here, we've got the same tree structure, just a bigger tree is causing problems. it looks to us the continuations are well written. we've typed the code explicitly. and all calls end with a continuation in all cases...
We are looking for expert advice !!! thanks !!!
I did not try to understand the whole code snippet, because it is fairly long, but you'll certainly need to replace the for loop with an iteration implemented using continuation passing style. Something like:
let rec iterc f cont list =
match list with
| [] -> cont ()
| x::xs -> f x (fun () -> iterc f cont xs)
I didn't understand the purpose of cont in your DFSub function (it is never called, is it?), but the continuation based version would look roughly like this:
let rec DFSsub (G:DFSgraph2)(n:int) cont =
G.[n].explored2 <- true
G.[n].leader <- s
G.[n].children
|> iterc
(fun j cont -> if not(G.[j].explored2) then DFSsub G j cont else cont ())
(fun () -> t <- t + 1)
Overflowing the stack when you recurse through hundreds of thousands of entries isn't bad at all, really. A lot of programming language implementations will choke on much shorter recursions than that. You're having serious programmer problems — nothing to be ashamed of!
Now if you want to do deeper recursions than your implementation will handle, you need to transform your algorithm so it is iterative and/or tail-recursive (the two are isomorphic — except that tail-recursion allows for decentralization and modularity, whereas iteration is centralized and non-modular).
To transform an algorithm from recursive to tail-recursive, which is an important skill to possess, you need to understand the state that is implicitly stored in a stack frame, i.e. those free variables in the function body that change across the recursion, and explicitly store them in a FIFO queue (a data structure that replicates your stack, and can be implemented trivially as a linked list). Then you can pass that linked list of reified frame variables as an argument to your tail recursive functions.
In more advanced cases where you have many tail recursive functions each with a different kind of frame, instead of simple self-recursion, you may need to define some mutually recursive data types for the reified stack frames, instead of using a list. But I believe Kosaraju's algorithm only involves self-recursive functions.
OK, so the code given above was the RIGHT code !
the problem lies with the compiler of F#
here is some words about it from Microsoft
http://blogs.msdn.com/b/fsharpteam/archive/2011/07/08/tail-calls-in-fsharp.aspx
Basically, be careful with the settings, in default mode, the compiler may NOT make automatically the tail calls. To do so, in VS2015, go to the Solution Explorer, right click with the mouse and click on "Properties" (the last element of the scrolling list)
Then in the new window, click on "Build" and tick the box "Generate tail calls"
It is also to check if the compiler did its job looking at the disassembly using
ILDASM.exe
you can find the source code for the whole algo in my github repository
https://github.com/FaguiCurtain/Learning-Fsharp/blob/master/Algo%20Stanford/Algo%20Stanford/Kosaraju_cont.fs
on a performance point of view, i'm not very satisfied. The code runs on 36 seconds on my laptop. From the forum with other fellow MOOCers, C/C++/C# typically executes in subsecond to 5s, Java around 10-15, Python around 20-30s.
So my implementation is clearly not optimized. I am now happy to hear about tricks to make it faster !!! thanks !!!!

Pass function as a parameter and overload it

I want to time my functions, some of them use up to three parameters. Right now I'm using the same code below with some variations for the three.
let GetTime f (args : string) =
let sw = Stopwatch.StartNew()
f (args)
printfn "%s : %A" sw.Elapsed
I want to replace the three functions with this one.
let GetTime f ( args : 'T[]) =
let sW = Stopwatch.StartNew()
match args.Length with
| 1 -> f args.[0]
| 2 -> f (args.[0] args.[1])
printfn "%A" sW.Elapsed
()
But I'm getting an error of type mismatch, if I use the three functions it works. Is it possible to send the function as a parameter and use it like this?
Why not just do something like this?
let getTime f =
let sw = Stopwatch.StartNew()
let result = f ()
printfn "%A" sw.Elapsed
result
Assuming that f1, f2, and f3 are three functions that take respectively 1, 2, and 3 arguments, you can use the getTime function like this:
getTime (fun () -> f1 "foo")
getTime (fun () -> f2 "foo" "bar")
getTime (fun () -> f3 "foo" "bar" "baz")
However, if you just need to time some functions in FSI, this feature is already built-in: just type
> #time;;
and timing will be turned on.
It isn't possible for the compiler to know how many arguments will be passed at runtime, so the function f must satisfy both 'T -> unit and 'T -> 'T -> unit. This form also requires all arguments to be of the same type.
The following approach delays the function execution and may be suitable for your needs.
let printTime f =
let sw = Stopwatch.StartNew()
f() |> ignore
printfn "%A" sw.Elapsed
let f1 s = String.length s
let f2 s c = String.concat c s
printTime (fun () -> f1 "Test")
printTime (fun () -> f2 [| "Test1"; "Test2" |] ",")
You're probably thinking of passing a method group as an argument to GetTime, and then having the compiler decide which overload of the method group to call. That's not possible with any .NET compiler. Method groups are used for code analysis by compilers and tools such as ReSharper, but they are not something that actually exists at runtime.
If your functions take their arguments in tupled form, like these:
let f1 (s: string, b: bool) =
System.Threading.Thread.Sleep 1000
s
let f2 (n: int, s:string, dt: System.DateTime) =
System.Threading.Thread.Sleep 1000
n+1
then the implementation becomes trivial:
let Timed f args =
let sw = System.Diagnostics.Stopwatch.StartNew()
let ret = f args
printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
ret
Usage:
f1
|> Timed // note, at this time we haven't yet applied any arguments
<| ("foo", true)
|> printfn "f1 done, returned %A"
f2
|> Timed
<| (42, "bar", DateTime.Now)
|> printfn "f2 done, returned %A"
However, if the functions take their arguments in curried form, like this:
let f1Curried (s: string) (b: bool) =
System.Threading.Thread.Sleep 1000
s
let f2Curried (n: int) (s:string) (dt: System.DateTime) =
System.Threading.Thread.Sleep 1000
n+1
it becomes a bit tricky. The idea is using standard operators (<|), (<||), and (<|||) that are intended to uncurry the arguments.
let Timed2 op f args =
let sw = System.Diagnostics.Stopwatch.StartNew()
let ret = op f args
printfn "Called with arguments %A, elapsed %A" args sw.Elapsed
ret
f1Curried
|> Timed2 (<||) // again, no arguments are passed yet
<| ("foo", true)
|> printfn "f1Curried done, returned %A"
f2Curried
|> Timed2 (<|||)
<| (42, "bar", DateTime.Now)
|> printfn "f2Curried done, returned %A"

Resources