F# suitable container for (string, float, float) triads?

I have the following problem and I hope somebody can help me.
Short description of the problem: I need to store a (string A, float B, float C) triad in a suitable container. The triad originates from a double "for" loop.
The essential point is that I will need to slice this container once the loops are over, to perform other operations.
An example that can be executed from the .fsx shell (using Deedle frames) follows. The triad is what is being printed on the screen.
open Deedle
let categorical_variable = [| "A"; "B"; "C"; "A"; "B"; "C"; |]
let vec_1 = [| 15.5; 14.3; 15.5; 14.3; 15.5; 14.3; |]
let vec_2 = [| 114.3; 17.5; 9.3; 88.7; 115.5; 12.3; |]
let dframe = frame ["cat" =?> Series.ofValues categorical_variable
                    "v1" =?> Series.ofValues vec_1
                    "v2" =?> Series.ofValues vec_2 ]
let distinct_categorical_variables = categorical_variable |> Array.toSeq |> Seq.distinct |> Seq.toArray
let mutable frame_slice : Frame<int, string> = Frame.ofRows []
let mutable frame_slice_vec_1 : float[] = Array.empty
let mutable frame_slice_vec_1_distinct : float[] = Array.empty
for cat_var in distinct_categorical_variables do
    frame_slice <- (dframe |> Frame.filterRowValues (fun row -> row.GetAs "cat" = cat_var))
    frame_slice_vec_1 <- (frame_slice?v1).Values |> Seq.toArray
    frame_slice_vec_1_distinct <- (frame_slice_vec_1 |> Array.toSeq |> Seq.distinct |> Seq.toArray)
    for vec_1_iter in frame_slice_vec_1_distinct do
        printfn "%s, %f, %f \n" cat_var vec_1_iter (Array.average ((frame_slice?v2).Values |> Seq.toArray) ) |> ignore
So, is there any suitable object in which to store this triad? I saw Array3D objects, but I don't think they are the right solution because A, B and C of my triad have different types.
Many thanks in advance.

You probably want a sequence expression with tuples:
let mySequence =
    seq { for cat_var in distinct_categorical_variables do
            ...
            for vec_1_iter in ... do
                yield cat_var, vec_1_iter, Array.average ... }
// then use it like
for cat_var, vec_1_iter, result in mySequence do
    ...
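To make the idea concrete, here is a minimal sketch that builds the triads as an array of tuples and then slices it. It assumes plain arrays in place of the Deedle frame so that it runs without extra packages; the names categories, vec1, vec2, triads and onlyA are made up for this illustration.

```fsharp
// Plain arrays stand in for the Deedle frame columns from the question.
let categories = [| "A"; "B"; "C"; "A"; "B"; "C" |]
let vec1 = [| 15.5; 14.3; 15.5; 14.3; 15.5; 14.3 |]
let vec2 = [| 114.3; 17.5; 9.3; 88.7; 115.5; 12.3 |]

// One (category, v1, mean of v2 within that category) triad
// per distinct v1 value in each category.
let triads : (string * float * float)[] =
    Array.zip3 categories vec1 vec2
    |> Array.groupBy (fun (cat, _, _) -> cat)
    |> Array.collect (fun (cat, rows) ->
        let meanV2 = rows |> Array.averageBy (fun (_, _, v2) -> v2)
        rows
        |> Array.map (fun (_, v1, _) -> v1)
        |> Array.distinct
        |> Array.map (fun v1 -> cat, v1, meanV2))

// "Slicing" afterwards is an ordinary filter on the first component.
let onlyA = triads |> Array.filter (fun (cat, _, _) -> cat = "A")
printfn "%A" onlyA
```

Because each element is a tuple, the three components keep their own types, which is what Array3D cannot offer.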

Related

Understanding Mutability in F# : case study

I'm a beginner in F#, and this is my first attempt at programming something serious. I'm sorry the code is a bit long, but there are some issues with mutability that I don't understand.
This is an implementation of Karger's MinCut algorithm, which calculates the minimum cut of an undirected graph component. I won't discuss here how the algorithm works;
for more info see https://en.wikipedia.org/wiki/Karger%27s_algorithm
What is important is that it's a randomized algorithm, which runs a set number of trials and takes the "best" run.
I realize now that I could avoid a lot of the problems below if I wrote a dedicated function for each random trial, but I'd like to understand EXACTLY what is wrong in the implementation below.
I'm running the code on this simple graph (the mincut is 2 when we cut the graph
into 2 components (1,2,3,4) and (5,6,7,8), with only 2 edges between those 2 components)
3--4-----5--6
|\/|     |\/|
|/\|     |/\|
2--1-----7--8
The file simplegraph.txt should encode this graph as follows
(1st column = node number, other columns = links)
1 2 3 4 7
2 1 3 4
3 1 2 4
4 1 2 3 5
5 4 6 7 8
6 5 7 8
7 1 5 6 8
8 5 6 7
This code may still look too much like imperative programming; I'm sorry for that.
There is a main for i loop that runs each trial.
The first execution (when i=1) looks smooth and perfect,
but I get a runtime error when i=2, because it looks like some variables,
such as WG, are not reinitialized correctly, causing out-of-bounds errors.
WG, WG1 and WGmin are of type wgraphobj, which is a record of Dictionary objects.
WG1 is defined outside the main loop and I make no new assignments to WG1
[but its type is mutable, alas].
I first defined WG with the instruction
let mutable WG = WG1
then, at the beginning of the for i loop,
I write
WG <- WG1
and later I modify the WG object in each trial to make some calculations.
When the trial is finished and we go to the next trial (i is increased), I want to reset WG to its initial state, i.e. equal to WG1.
But it seems it's not working, and I don't get why...
Here is the full code
MyModule.fs [some functions not necessary for execution]
namespace MyModule
module Dict =
    open System.Collections.Generic
    let toSeq d = d |> Seq.map (fun (KeyValue(k,v)) -> (k,v))
    let toArray (d:IDictionary<_,_>) = d |> toSeq |> Seq.toArray
    let toList (d:IDictionary<_,_>) = d |> toSeq |> Seq.toList
    let ofMap (m:Map<'k,'v>) = new Dictionary<'k,'v>(m) :> IDictionary<'k,'v>
    let ofList (l:('k * 'v) list) = new Dictionary<'k,'v>(l |> Map.ofList) :> IDictionary<'k,'v>
    let ofSeq (s:('k * 'v) seq) = new Dictionary<'k,'v>(s |> Map.ofSeq) :> IDictionary<'k,'v>
    let ofArray (a:('k * 'v) []) = new Dictionary<'k,'v>(a |> Map.ofArray) :> IDictionary<'k,'v>
Karger.fs
open MyModule.Dict
open System.IO
let x = File.ReadAllLines "\..\simplegraph.txt";;
// val x : string [] =
let splitAtTab (text:string)=
    text.Split [|'\t';' '|]
let splitIntoKeyValue (s:seq<'T>) =
    (Seq.head s, Seq.tail s)
let parseLine (line:string)=
    line
    |> splitAtTab
    |> Array.filter (fun s -> not(s=""))
    |> Array.map (fun s-> (int s))
    |> Array.toSeq
    |> splitIntoKeyValue
let y =
    x |> Array.map parseLine
open System.Collections.Generic
// let graph = new Map <int, int array>
let graphD = new Dictionary<int,int seq>()
y |> Array.iter graphD.Add
let graphM = y |> Map.ofArray //immutable
let N = y.Length // number of nodes
let Nruns = 2
let remove_table = new Dictionary<int,bool>()
[for i in 1..N do yield (i,false)] |> List.iter remove_table.Add
// let remove_table = seq [|for a in 1 ..N -> false|] // plus court
let label_head_table = new Dictionary<int,int>()
[for i in 1..N do yield (i,i)] |> List.iter label_head_table.Add
let label = new Dictionary<int,int seq>()
[for i in 1..N do yield (i,[i])] |> List.iter label.Add
let mutable min_cut = 1000000
type wgraphobj =
    { Graph : Dictionary<int,int seq>
      RemoveTable : Dictionary<int,bool>
      Label : Dictionary<int,int seq>
      LabelHead : Dictionary<int,int> }
let WG1 = {Graph = graphD;
           RemoveTable = remove_table;
           Label = label;
           LabelHead = label_head_table}
let mutable WGmin = WG1
let IsNotRemoved x = //
    match x with
    | (i,false) -> true
    | (i,true) -> false
let IsNotRemoved1 WG i = //
    (i,WG.RemoveTable.[i]) |>IsNotRemoved
let GetLiveNode d =
    let myfun x =
        match x with
        | (i,b) -> i
    d |> toList |> List.filter IsNotRemoved |> List.map myfun
let rand = System.Random()
// subsets a dictionary given a sub_list of keys
let D_Subset (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
    let z = Dictionary<'T,'U>() // create new empty dictionary
    sub_list |> List.filter (fun k -> dict.ContainsKey k)
             |> List.map (fun k -> (k, dict.[k]))
             |> List.iter (fun s -> z.Add s)
    z
// subsets a dictionary given a sub_list of keys to remove
let D_SubsetC (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
    let z = dict // note: z is the same object as dict, so the removals below mutate the argument
    sub_list |> List.filter (fun k -> dict.ContainsKey k)
             |> List.map (fun k -> (dict.Remove k)) |>ignore
    z
// subsets a sequence by values in a sequence
let S_Subset (S:seq<'T>)(sub_list:seq<'T>) =
    S |> Seq.filter (fun s-> Seq.exists (fun elem -> elem = s) sub_list)
let S_SubsetC (S:seq<'T>)(sub_list:seq<'T>) =
    S |> Seq.filter (fun s-> not(Seq.exists (fun elem -> elem = s) sub_list))
[<EntryPoint>]
let main argv =
    let mutable u = 0
    let mutable v = 0
    let mutable r = 0
    let mutable N_cut = 1000000
    let mutable cluster_A_min = seq [0]
    let mutable cluster_B_min = seq [0]
    let mutable WG = WG1
    let mutable LiveNodeList = [0]
    // when i = 2, i encounter problems with mutability
    for i in 1 .. Nruns do
        WG <- WG1
        printfn "%d" i
        for k in 1..(N-2) do
            LiveNodeList <- GetLiveNode WG.RemoveTable
            r <- rand.Next(0,N-k)
            u <- LiveNodeList.[r] //selecting a live node
            let uuu = WG.Graph.[u] |> Seq.map (fun s -> WG.LabelHead.[s] )
                                   |> Seq.filter (IsNotRemoved1 WG)
                                   |> Seq.distinct
            let n_edge = uuu |> Seq.length
            let x = rand.Next(1,n_edge)
            let mutable ok = false //maybe we can take this out
            while not(ok) do
                // selecting the edge from node u
                v <- WG.LabelHead.[Array.get (uuu |> Seq.toArray) (x-1)]
                let vvv = WG.Graph.[v] |> Seq.map (fun s -> WG.LabelHead.[s] )
                                       |> Seq.filter (IsNotRemoved1 WG)
                                       |> Seq.distinct
                let zzz = S_SubsetC (Seq.concat [uuu;vvv] |> Seq.distinct) [u;v]
                WG.Graph.[u] <- zzz
                let lab_u = WG.Label.[u]
                let lab_v = WG.Label.[v]
                WG.Label.[u] <- Seq.concat [lab_u;lab_v] |> Seq.distinct
                if (k<N-1) then
                    WG.RemoveTable.[v]<-true
                    //updating Label_head for all members of Label.[v]
                    WG.LabelHead.[v]<- u
                    for j in WG.Label.[v] do
                        WG.LabelHead.[j]<- u
                ok <- true
            printfn "u= %d v=%d" u v
        // end of for k in 1..(N-2)
        // counting cuts
        // u,v contain the 2 indexes of groupings
        let cluster_A = WG.Label.[u]
        let cluster_B = S_SubsetC (seq[for i in 1..N do yield i]) cluster_A // defined as complementary of A
        // let WG2 = {Graph = D_Subset WG1.Graph (cluster_A |> Seq.toList)
        //            RemoveTable = remove_table
        //            Label = D_Subset WG1.Graph (cluster_A |> Seq.toList)
        //            LabelHead = label_head_table}
        let cross_edge = // returns keyvalue pair (k,S')
            let IsInCluster cluster (k,S) =
                (k,S_Subset S cluster)
            graphM |> toSeq |> Seq.map (IsInCluster cluster_B)
        N_cut <-
            cross_edge |> Seq.map (fun (k:int,v:int seq)-> Seq.length v)
                       |> Seq.sum
        if (N_cut<min_cut) then
            min_cut <- N_cut
            WGmin <- WG
            cluster_A_min <- cluster_A
            cluster_B_min <- cluster_B
    // end of for i in 1..Nruns
    0 // return an integer exit code
Description of the algo (I don't think it's essential to solve my problem):
At each trial there are several steps. At each step, we merge 2 nodes into 1 (effectively removing 1) and update the graph. We do that 6 times until there are only 2 nodes left, which we define as 2 clusters, and we look at the number of cross edges between those 2 clusters. If we are "lucky", those 2 clusters are (1,2,3,4) and (5,6,7,8) and we find the right number of cuts.
At each step, the object WG is updated with the effects of merging 2 nodes,
with only the live nodes (the ones not eliminated as a result of merging 2 nodes) being kept perfectly up to date.
WG.Graph is the updated graph
WG.Label contains the labels of the nodes which have been merged into the current node
WG.LabelHead contains the label of the node into which that node has been merged
WG.RemoveTable says if the node has been removed or not.
Thanks in advance to anyone willing to take a look at it!
"It seems not working" because wgraphobj is a reference type, which is allocated on the heap, and WG <- WG1 copies only the reference, not the dictionaries. This means that when you're mutating the innards of WG, you're also mutating the innards of WG1, because they're the same innards.
This is precisely the kind of mess you get yourself into if you use mutable state. This is why people recommend not using it. In particular, your use of mutable dictionaries undermines the robustness of your algorithm. I recommend using F#'s own efficient immutable dictionary (called Map) instead.
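The aliasing described above can be reproduced in a few lines. This is only a sketch: MiniGraph is a hypothetical, cut-down stand-in for wgraphobj, and copyOf is an illustrative helper, not something from the question's code.

```fsharp
open System.Collections.Generic

// Hypothetical, cut-down version of the record from the question.
type MiniGraph = { RemoveTable : Dictionary<int, bool> }

let original = { RemoveTable = Dictionary<int, bool>() }
original.RemoveTable.[1] <- false

// Binding (or assigning with <-) copies the reference, not the dictionary.
let mutable alias = original
alias.RemoveTable.[1] <- true
printfn "%b" original.RemoveTable.[1]   // true: the "original" changed as well

// A fresh Dictionary built from the old one is independent of it.
let copyOf (wg : MiniGraph) =
    { RemoveTable = Dictionary<int, bool>(wg.RemoveTable) }

original.RemoveTable.[1] <- false
let trial = copyOf original
trial.RemoveTable.[1] <- true
printfn "%b" original.RemoveTable.[1]   // false: only the copy was modified
```

If you want to keep the dictionaries, one option is to build such a copy of every field at the start of each trial instead of writing WG <- WG1. In the question's code the dictionary values are sequences that get replaced rather than modified in place, so copying each dictionary this way is probably sufficient, but that is an assumption worth checking.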
Now, in response to your comment about WG.Graph <- GraphD giving compile error.
WG is mutable, but WG.Graph is not (but the contents of WG.Graph are again mutable). There is a difference, let me try to explain it.
WG is mutable in the sense that it points to some object of type wgraphobj, but you can make it, in the course of your program, to point at another object of the same type.
WG.Graph, on the other hand, is a field packed inside WG. It points to some object of type Dictionary<_,_>, and you cannot make it point to another object. You can create a different wgraphobj, in which the field Graph points to a different dictionary, but you cannot change where the field Graph of the original wgraphobj points.
In order to make the field Graph itself mutable, you can declare it as such:
type wgraphobj = {
    mutable Graph: Dictionary<int, int seq>
    ...
Then you will be able to mutate that field:
WG.Graph <- GraphD
Note that in this case you do not need to declare the value WG itself as mutable.
However, it seems to me that for your purposes you can actually go the way of creating a new instance wgraphobj with the field Graph changed, and assigning it to the mutable reference WG:
WG <- { WG with Graph = GraphD }
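A small runnable sketch of the copy-and-update form follows. TwoFieldGraph and the value names are hypothetical stand-ins for wgraphobj, WG1 and GraphD, chosen only to show which references change.

```fsharp
open System.Collections.Generic

// Hypothetical two-field record, standing in for wgraphobj.
type TwoFieldGraph =
    { Graph : Dictionary<int, int list>
      Label : Dictionary<int, int list> }

let first =
    { Graph = Dictionary<int, int list>()
      Label = Dictionary<int, int list>() }

let otherGraph = Dictionary<int, int list>()
otherGraph.[1] <- [ 2; 3 ]

// The variable is mutable; the record fields are not.
let mutable wg = first
wg <- { wg with Graph = otherGraph }

printfn "%b" (obj.ReferenceEquals(wg.Graph, otherGraph))    // true
printfn "%b" (obj.ReferenceEquals(first.Graph, otherGraph)) // false: first is untouched
printfn "%b" (obj.ReferenceEquals(wg.Label, first.Label))   // true: unchanged fields are shared
```

Note the last line: fields that are not named after `with` still point at the same dictionaries, so mutating them through the new record affects the old one too.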

F# split array of strings and return the result of split

let total = [| "1X2"; "3X4"; "5X6" |]
let oddEven = total
              |> Array.map(fun x -> x.Split('X'))
I have an array of strings (total in the example above). I want to split each element at "X", as oddEven does in the example above, but I want to get back 2 arrays:
let odd = [| 1; 3; 5 |] and let even = [| 2; 4; 6 |]
It could be an easy task, but I cannot figure it out right now.
Any help is greatly appreciated!
Thanks,
You should check whether each string can split into two pieces, and unzip the result:
let total = [| "1X2"; "3X4"; "5X6" |]
let odds, evens = total |> Array.map (fun x -> match x.Split('X') with
                                               | [|odd; even|] -> odd, even
                                               | _ -> failwith "Wrong input")
                        |> Array.unzip;;
let evens, odds = total
                  |> (Array.map (fun x -> x.Split('X')))
                  |> Array.concat
                  |> Array.partition (fun s -> int s % 2 = 0)
EDIT: As John Palmer points out in the comments, you can use Array.collect instead of map and concat:
let evens, odds = total
                  |> Array.collect (fun s -> s.Split('X'))
                  |> Array.partition (fun s -> int s % 2 = 0);;
let odd =
    oddEven |> Array.map (fun x -> x.[0])
let even =
    oddEven |> Array.map (fun x -> x.[1])
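The answers above return arrays of strings, while the question shows integer arrays. A minimal sketch that also converts each half to int, assuming every item has exactly one 'X' and both halves parse as integers:

```fsharp
let total = [| "1X2"; "3X4"; "5X6" |]

// Split each item once, convert both halves to int, then unzip the pairs.
let odd, even =
    total
    |> Array.map (fun s ->
        match s.Split('X') with
        | [| a; b |] -> int a, int b
        | _ -> failwithf "Unexpected item: %s" s)
    |> Array.unzip

printfn "%A %A" odd even
```

This splits by position (left half, right half), so it does not depend on the numbers actually being odd or even, unlike the Array.partition variants.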

f# array.filter based on a bool array

If I have an array A and another bool array isChosen with the same length as A, how can I build a new array from the elements of A where isChosen is true? Something like A.[isChosen]? I cannot use Array.filter directly, since isChosen is not a function of A's elements and there is no Array.filteri like Array.mapi.
zip should help:
let l = [|1;2;3|]
let f = [|true; false; true|]
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]
// or
let r = (l, f) ||> Seq.zip |> Seq.filter snd |> Seq.map fst |> Seq.toArray
Try the zip function
Seq.zip A isChosen
|> Seq.filter snd
|> Seq.map fst
|> Array.ofSeq
This will create a sequence of tuples where one value is from A and the other is from isChosen. This will pair the values together and make it very easy to filter them out in a Seq.filter expression
It's not as elegant or 'functional' as the other answers, but every once in a while I like a gentle reminder that you can use loops and array indices in F#:
let A = [|1;2;3|]
let isChosen = [|true; false; true|]
let r = [| for i in 0..A.Length-1 do
             if isChosen.[i] then
               yield A.[i] |]
printfn "%A" r
:)
And here are two more ways, just to demonstrate (even) more F# library functions:
let A = [|1;2;3|]
let isChosen = [|true;false;true|]
let B = Seq.map2 (fun x b -> if b then Some x else None) A isChosen
        |> Seq.choose id
        |> Seq.toArray
let C = Array.foldBack2 (fun x b acc -> if b then x::acc else acc) A isChosen []
        |> List.toArray
My personal favorite for understandability (and therefore maintainability): desco's answer
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]
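If this comes up often, the zip approach can be wrapped in a small function. The name filterByMask is made up for this sketch and is not part of FSharp.Core. It uses Array.zip rather than Seq.zip on purpose: Array.zip throws an ArgumentException when the two arrays differ in length, whereas Seq.zip silently stops at the shorter input.

```fsharp
// Hypothetical helper: keep the items whose mask entry is true.
let filterByMask (mask : bool[]) (items : 'T[]) : 'T[] =
    Array.zip items mask      // fails if the lengths differ
    |> Array.filter snd       // keep pairs whose flag is true
    |> Array.map fst          // drop the flag again

let r = filterByMask [| true; false; true |] [| 1; 2; 3 |]
printfn "%A" r
```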

How to "convert" a Dictionary into a sequence in F#?

How do I "convert" a Dictionary into a sequence so that I can sort by key value?
let results = new Dictionary()
results.Add("George", 10)
results.Add("Peter", 5)
results.Add("Jimmy", 9)
results.Add("John", 2)
let ranking =
results
???????
|> Seq.Sort ??????
|> Seq.iter (fun x -> (... some function ...))
A System.Collections.Generic.Dictionary<K,V> is an IEnumerable<KeyValuePair<K,V>>, and the F# Active Pattern 'KeyValue' is useful for breaking up KeyValuePair objects, so:
open System.Collections.Generic
let results = new Dictionary<string,int>()
results.Add("George", 10)
results.Add("Peter", 5)
results.Add("Jimmy", 9)
results.Add("John", 2)
results
|> Seq.sortBy (fun (KeyValue(k,v)) -> k)
|> Seq.iter (fun (KeyValue(k,v)) -> printfn "%s: %d" k v)
You may also find the dict function useful. Let F# do some type inference for you:
let results = dict ["George", 10; "Peter", 5; "Jimmy", 9; "John", 2]
> val results : System.Collections.Generic.IDictionary<string,int>
Another option, which doesn't need a lambda until the end
dict ["George", 10; "Peter", 5; "Jimmy", 9; "John", 2]
|> Seq.map (|KeyValue|)
|> Seq.sortBy fst
|> Seq.iter (fun (k,v) -> ())
with help from https://gist.github.com/theburningmonk/3363893
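Once the pairs are plain tuples, any ordering is a one-liner. The sketch below reuses the data from the question; the names byKey and byScoreDesc are made up here, and sorting by descending value is only a guess at what a "ranking" might need.

```fsharp
open System.Collections.Generic

let results = Dictionary<string, int>()
results.Add("George", 10)
results.Add("Peter", 5)
results.Add("Jimmy", 9)
results.Add("John", 2)

// Turn the dictionary into (key, value) tuples once.
let pairs = results |> Seq.map (fun (KeyValue(k, v)) -> k, v)

// Sort by key, or by value (highest first).
let byKey = pairs |> Seq.sortBy fst |> Seq.toList
let byScoreDesc = pairs |> Seq.sortBy (fun (_, v) -> -v) |> Seq.toList

printfn "%A" byKey
printfn "%A" byScoreDesc
```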

How do I concatenate a list of strings in F#?

I'm trying this at the moment, but I haven't quite got the method signature worked out... anyone? messages is a field of seq[string]
let messageString = List.reduce(messages, fun (m1, m2) -> m1 + m2 + Environment.NewLine)
> String.concat " " ["Juliet"; "is"; "awesome!"];;
val it : string = "Juliet is awesome!"
Not exactly what you're looking for, but
let strings = [| "one"; "two"; "three" |]
let r = System.String.Concat(strings)
printfn "%s" r
You can do
let strings = [ "one"; "two"; "three" ]
let r = strings |> List.fold (+) ""
printfn "%s" r
or
let strings = [ "one"; "two"; "three" ]
let r = strings |> List.fold (fun r s -> r + s + "\n") ""
printfn "%s" r
I'd use String.concat unless you need to do fancier formatting and then I'd use StringBuilder.
(StringBuilder(), [ "one"; "two"; "three" ])
||> Seq.fold (fun sb str -> sb.AppendFormat("{0}\n", str))
Just one more comment:
when you are working with strings, it is better to use the standard string functions.
The following code is for Project Euler problem 40.
let problem40 =
    let str = {1..1000000} |> Seq.map string |> String.concat ""
    let l = [str.[0];str.[9];str.[99];str.[999];str.[9999];str.[99999];str.[999999];]
    l |> List.map (fun x-> (int x) - (int '0')) |> List.fold (*) 1
If the second line of the above program used fold instead of concat, it would be extremely slow, because each iteration of fold creates a new long string.
System.String.Join(System.Environment.NewLine, List.toArray messages)
or using your reduce (note that it's much less efficient)
List.reduce (fun a b -> a + System.Environment.NewLine + b) messages
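Since the question says messages is a seq of strings, here is a minimal sketch that works on a seq directly, without converting to a list or array first. The sample contents of messages are invented for the illustration.

```fsharp
open System

// messages stands in for the seq<string> field from the question.
let messages : seq<string> = seq [ "first"; "second"; "third" ]

// String.concat puts the separator between items only (no trailing newline).
let messageString = String.concat Environment.NewLine messages

// Equivalent .NET call; String.Join accepts any seq<string>.
let joined = String.Join(Environment.NewLine, messages)

printfn "%s" messageString
```

Unlike the fold-based versions, neither call leaves a newline after the last item.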
