fst and 3-tuple in fsharp - f#

Do you know the nicest way to make this work :
let toTableau2D (seqinit:seq<'a*'b*'c>) =
let myfst = fun (a,b,c) -> a
let myscd = fun (a,b,c) -> b
let mytrd = fun (a,b,c) -> c
let inputd = seqinit |> groupBy2 myfst myscd
there must be a better way than rewriting fst..
UPDATE
After pad advice, I rewrote packing the previous 'a*'b into a single structure
My code now looks like
let toTableau (seqinit:seq<'a*'b>) =
let inputd = seqinit |> Seq.groupBy fst |> toMap
let keys = seqinit |> Seq.map fst |> Set.ofSeq |> List.ofSeq
...

Why don't you just write it explicitly:
let toTableau2D (a, b, c) =
let toto = a
// ...
If you want to refer to seqinit later on, you always can reconstruct the triple or use the named pattern:
let toTableau2D ((a, b, c) as seqinit) =
let toto = a
// Do something with seqinit
// ...
EDIT:
Unless you use reflection, you cannot have fst function for any kind of tuples. In your example, writing some utility functions and reusing them doesn't hurt:
let fst3 (a, _, _) = a
let snd3 (_, b, _) = b
let thd3 (_, _, c) = c
let toTableau2D (seqinit: seq<'a*'b*'c>) =
let inputd = seqinit |> groupBy2 fst3 snd3
// ...
If you want to make this work for arbitrary number of tuple elements, consider changing tuples to lists and employing pattern matching on lists.

+1 to what #pad said. Otherwise (if you just simplified what you're trying to do and are stuck with seqinit defined that way) I guess you can always do:
let toTableau2D (seqinit:'a*'b*'c) =
let toto, _, _ = seqinit
//...

Related

Trying to create a function, and then filtering a sequence by "not that function" in F#

My data is a SEQUENCE of:
[(40,"TX");(48,"MO");(15,"TX");(78,"TN");(41,"VT")]
My code is as follows:
type Csvfile = CsvProvider<somefile>
let data = Csvfile.GetSample().Rows
let nullid row =
row.Id = 15
let otherid row =
row.Id= 40
let iddata =
data
|> Seq.filter (not nullid)
|> Seq.filter (not otherid)
I create the functions.
Then I want to call the "not" of those functions to filter them out of a sequence.
But the issue is that I am getting errors for "row.Id" in the first two functions, because you can only do that with a type.
How do I solve this problem so I can accomplish this successfully.
My result should be a SEQUENCE of:
[(48,"MO);(78,"TN");(41,"VT")]
You can use >> operator to compose the two functions:
let iddata =
data
|> Seq.filter (nullid >> not)
|> Seq.filter (othered >> not)
See Function Composition and Pipelining.
Or you can make it more explicit:
let iddata =
data
|> Seq.filter (fun x -> not (nullid x))
|> Seq.filter (fun x -> not (othered x))
You can see that in action:
let input = [|1;2;3;4;5;6;7;8;9;10|];;
let is3 value =
value = 3;;
input |> Seq.filter (fun x -> not (is3 x));;
input |> Seq.filter (not >> is3);;
They both print val it : seq<int> = seq [1; 2; 4; 5; ...]
Please see below what an MCVE might look in your case, for an fsx file you can reference the Fsharp.Data dll with #r, for a compiled project just reference the dll an open it.
#if INTERACTIVE
#r #"..\..\SO2018\packages\FSharp.Data\lib\net45\FSharp.Data.dll"
#endif
open FSharp.Data
[<Literal>]
let datafile = #"C:\tmp\data.csv"
type CsvFile = CsvProvider<datafile>
let data = CsvFile.GetSample().Rows
In the end this is what you want to achieve:
data
|> Seq.filter (fun x -> x.Id <> 15)
|> Seq.filter (fun x -> x.Id <> 40)
//val it : seq<CsvProvider<...>.Row> = seq [(48, "MO"); (78, "TN"); (41, "VT")]
One way to do this is with SRTP, as they allow a way to do structural typing, where the type depends on its shape, for example in this case having the Id property. If you want you can define helper function for the two numbers 15 and 40, and use that in your filter, just like in the second example. However SRTP syntax is a bit strange, and it's designed for a use case where you need to apply a function to different types that have some similarity (basically like interfaces).
let inline getId row =
(^T : (member Id : int) row)
data
|> Seq.filter (fun x -> (getId x <> 15 ))
|> Seq.filter (fun x -> (getId x <> 40))
//val it : seq<CsvProvider<...>.Row> = seq [(48, "MO"); (78, "TN"); (41, "VT")]
Now back to your original post, as you correctly point out your function will show an error, as you define it to be generic, but it needs to operate on a specific Csv row type (that has the Id property). This is very easy to fix, just add a type annotation to the row parameter. In this case your type is CsvFile.Row, and since CsvFile.Row has the Id property we can access that in the function. Now this function returns a Boolean. You could make it return the actual row as well.
let nullid (row: CsvFile.Row) =
row.Id = 15
let otherid (row: CsvFile.Row) =
row.Id = 40
Then what is left is applying this inside a Seq.filter and negating it:
let iddata =
data
|> Seq.filter (not << nullid)
|> Seq.filter (not << otherid)
|> Seq.toList
//val iddata : CsvProvider<...>.Row list = [(48, "MO"); (78, "TN"); (41, "VT")]

F# Need to rewrite code to not require a mutable variable

I have finished the project I have been working on but I am wanting to go back and cleanup my code. In this one instance I used a mutable variable however I want my code to contain no mutable variables. How would I rewrite this code section to return a bool but have it not mutable?
let mutable duplicates = false
for el in (combo|>Seq.head) do
let exists = Seq.exists (fun x -> x = el) (combo|>Seq.item 1)
duplicates <- exists
Any help would be appreciated, cheers!
let t = Seq.item 1 combo
let duplicates = Seq.head combo |> Seq.exists (fun el -> Seq.contains el t)
The usual caveats about handling seqs in this manner apply.
let s1 = combo |> Seq.head
let s2 = combo |> Seq.item 1
let duplicates = System.Linq.Enumerable.Intersect(s1, s2) |> Seq.isEmpty |> not

Understanding Mutability in F# : case study

I'm a beginner in F#, and this is my first attempt at programming something serious. I'm sorry the code is a bit long, but there are some issues with mutability that I don't understand.
This is an implementation of the Karger MinCut Algorithm to calculate the mincut on a non-directed graph component. I won't discuss here how the algo works,
for more info https://en.wikipedia.org/wiki/Karger%27s_algorithm
What is important is it's a randomized algorithm, which is running a determined number of trial runs, and taking the "best" run.
I realize now that I could avoid a lot of the problems below if I did construct a specific function for each random trial, but I'd like to understand EXACTLY what is wrong in the implementation below.
I'm running the code on this simple graph (the mincut is 2 when we cut the graph
into 2 components (1,2,3,4) and (5,6,7,8) with only 2 edges between those 2 components)
3--4-----5--6
|\/| |\/|
|/\| |/\|
2--1-----7--8
the file simplegraph.txt should encode this graph as follow
(1st column = node number, other columns = links)
1 2 3 4 7
2 1 3 4
3 1 2 4
4 1 2 3 5
5 4 6 7 8
6 5 7 8
7 1 5 6 8
8 5 6 7
This code may look too much as imperative programming yet, I'm sorry for that.
So There is a main for i loop calling each trial.
the first execution, (when i=1) looks smooth and perfect,
but I have runtime error execution when i=2, because it looks some variables,
like WG are not reinitialized correctly, causing out of bound errors.
WG, WG1 and WGmin are type wgraphobj, which are a record of Dictionary objects
WG1 is defined outside the main loop and i make no new assignments to WG1.
[but its type is mutable though, alas]
I defined first WG with the instruction
let mutable WG = WG1
then at the beginning of the for i loop,
i write
WG <- WG1
and then later, i modify the WG object in each trial to make some calculations.
when the trial is finished and we go to the next trial (i is increased) i want to reset WG to its initial state being like WG1.
but it seems its not working, and I don't get why...
Here is the full code
MyModule.fs [some functions not necessary for execution]
namespace MyModule
module Dict =
open System.Collections.Generic
let toSeq d = d |> Seq.map (fun (KeyValue(k,v)) -> (k,v))
let toArray (d:IDictionary<_,_>) = d |> toSeq |> Seq.toArray
let toList (d:IDictionary<_,_>) = d |> toSeq |> Seq.toList
let ofMap (m:Map<'k,'v>) = new Dictionary<'k,'v>(m) :> IDictionary<'k,'v>
let ofList (l:('k * 'v) list) = new Dictionary<'k,'v>(l |> Map.ofList) :> IDictionary<'k,'v>
let ofSeq (s:('k * 'v) seq) = new Dictionary<'k,'v>(s |> Map.ofSeq) :> IDictionary<'k,'v>
let ofArray (a:('k * 'v) []) = new Dictionary<'k,'v>(a |> Map.ofArray) :> IDictionary<'k,'v>
Karger.fs
open MyModule.Dict
open System.IO
let x = File.ReadAllLines "\..\simplegraph.txt";;
// val x : string [] =
let splitAtTab (text:string)=
text.Split [|'\t';' '|]
let splitIntoKeyValue (s:seq<'T>) =
(Seq.head s, Seq.tail s)
let parseLine (line:string)=
line
|> splitAtTab
|> Array.filter (fun s -> not(s=""))
|> Array.map (fun s-> (int s))
|> Array.toSeq
|> splitIntoKeyValue
let y =
x |> Array.map parseLine
open System.Collections.Generic
// let graph = new Map <int, int array>
let graphD = new Dictionary<int,int seq>()
y |> Array.iter graphD.Add
let graphM = y |> Map.ofArray //immutable
let N = y.Length // number of nodes
let Nruns = 2
let remove_table = new Dictionary<int,bool>()
[for i in 1..N do yield (i,false)] |> List.iter remove_table.Add
// let remove_table = seq [|for a in 1 ..N -> false|] // plus court
let label_head_table = new Dictionary<int,int>()
[for i in 1..N do yield (i,i)] |> List.iter label_head_table.Add
let label = new Dictionary<int,int seq>()
[for i in 1..N do yield (i,[i])] |> List.iter label.Add
let mutable min_cut = 1000000
type wgraphobj =
{ Graph : Dictionary<int,int seq>
RemoveTable : Dictionary<int,bool>
Label : Dictionary<int,int seq>
LabelHead : Dictionary<int,int> }
let WG1 = {Graph = graphD;
RemoveTable = remove_table;
Label = label;
LabelHead = label_head_table}
let mutable WGmin = WG1
let IsNotRemoved x = //
match x with
| (i,false) -> true
| (i,true) -> false
let IsNotRemoved1 WG i = //
(i,WG.RemoveTable.[i]) |>IsNotRemoved
let GetLiveNode d =
let myfun x =
match x with
| (i,b) -> i
d |> toList |> List.filter IsNotRemoved |> List.map myfun
let rand = System.Random()
// subsets a dictionary given a sub_list of keys
let D_Subset (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = Dictionary<'T,'U>() // create new empty dictionary
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (k, dict.[k]))
|> List.iter (fun s -> z.Add s)
z
// subsets a dictionary given a sub_list of keys to remove
let D_SubsetC (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = dict
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (dict.Remove k)) |>ignore
z
// subsets a sequence by values in a sequence
let S_Subset (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> Seq.exists (fun elem -> elem = s) sub_list)
let S_SubsetC (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> not(Seq.exists (fun elem -> elem = s) sub_list))
[<EntryPoint>]
let main argv =
let mutable u = 0
let mutable v = 0
let mutable r = 0
let mutable N_cut = 1000000
let mutable cluster_A_min = seq [0]
let mutable cluster_B_min = seq [0]
let mutable WG = WG1
let mutable LiveNodeList = [0]
// when i = 2, i encounter problems with mutability
for i in 1 .. Nruns do
WG <- WG1
printfn "%d" i
for k in 1..(N-2) do
LiveNodeList <- GetLiveNode WG.RemoveTable
r <- rand.Next(0,N-k)
u <- LiveNodeList.[r] //selecting a live node
let uuu = WG.Graph.[u] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let n_edge = uuu |> Seq.length
let x = rand.Next(1,n_edge)
let mutable ok = false //maybe we can take this out
while not(ok) do
// selecting the edge from node u
v <- WG.LabelHead.[Array.get (uuu |> Seq.toArray) (x-1)]
let vvv = WG.Graph.[v] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let zzz = S_SubsetC (Seq.concat [uuu;vvv] |> Seq.distinct) [u;v]
WG.Graph.[u] <- zzz
let lab_u = WG.Label.[u]
let lab_v = WG.Label.[v]
WG.Label.[u] <- Seq.concat [lab_u;lab_v] |> Seq.distinct
if (k<N-1) then
WG.RemoveTable.[v]<-true
//updating Label_head for all members of Label.[v]
WG.LabelHead.[v]<- u
for j in WG.Label.[v] do
WG.LabelHead.[j]<- u
ok <- true
printfn "u= %d v=%d" u v
// end of for k in 1..(N-2)
// counting cuts
// u,v contain the 2 indexes of groupings
let cluster_A = WG.Label.[u]
let cluster_B = S_SubsetC (seq[for i in 1..N do yield i]) cluster_A // defined as complementary of A
// let WG2 = {Graph = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// RemoveTable = remove_table
// Label = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// LabelHead = label_head_table}
let cross_edge = // returns keyvalue pair (k,S')
let IsInCluster cluster (k,S) =
(k,S_Subset S cluster)
graphM |> toSeq |> Seq.map (IsInCluster cluster_B)
N_cut <-
cross_edge |> Seq.map (fun (k:int,v:int seq)-> Seq.length v)
|> Seq.sum
if (N_cut<min_cut) then
min_cut <- N_cut
WGmin <- WG
cluster_A_min <- cluster_A
cluster_B_min <- cluster_B
// end of for i in 1..Nruns
0 // return an integer exit code
Description of the algo: (i don't think its too essential to solve my problem)
at each trial, there are several steps. at each step, we merge 2 nodes into 1, (removing effectively 1) updating the graph. we do that 6 times until there are only 2 nodes left, which we define as 2 clusters, and we look at the number of cross edges between those 2 clusters. if we are "lucky" those 2 clusters would be (1,2,3,4) and (5,6,7,8) and find the right number of cuts.
at each step, the object WG is updated with the effects of merging 2 nodes
with only LiveNodes (the ones which are not eliminated as a result of merging 2 nodes) being perfectly kept up to date.
WG.Graph is the updated graph
WG.Label contains the labels of the nodes which have been merged into the current node
WG.LabelHead contains the label of the node into which that node has been merged
WG.RemoveTable says if the node has been removed or not.
Thanks in advance for anyone willing to take a look at it !
"It seems not working", because wgraphobj is a reference type, which is allocated on the stack, which means that when you're mutating the innards of WG, you're also mutating the innards of WG1, because they're the same innards.
This is precisely the kind of mess you get yourself into if you use mutable state. This is why people recommend to not use it. In particular, your use of mutable dictionaries undermines the robustness of your algorithm. I recommend using the F#'s own efficient immutable dictionary (called Map) instead.
Now, in response to your comment about WG.Graph <- GraphD giving compile error.
WG is mutable, but WG.Graph is not (but the contents of WG.Graph are again mutable). There is a difference, let me try to explain it.
WG is mutable in the sense that it points to some object of type wgraphobj, but you can make it, in the course of your program, to point at another object of the same type.
WG.Graph, on the other hand, is a field packed inside WG. It points to some object of type Dictionary<_,_>. And you cannot make it point to another object. You can create a different wgraphobj, in which the field Graph point to a different dictionary, but you cannot change where the field Graph of the original wgraphobj points.
In order to make the field Graph itself mutable, you can declare it as such:
type wgraphobj = {
mutable Graph: Dictionary<int, int seq>
...
Then you will be able to mutate that field:
WG.Graph <- GraphD
Note that in this case you do not need to declare the value WG itself as mutable.
However, it seems to me that for your purposes you can actually go the way of creating a new instance wgraphobj with the field Graph changed, and assigning it to the mutable reference WG:
WG.Graph <- { WG with Graph = GraphD }

Get a list of invalid drive letters

let private GetDrives = seq{
let all=System.IO.DriveInfo.GetDrives()
for d in all do
//if(d.IsReady && d.DriveType=System.IO.DriveType.Fixed) then
yield d
}
let valid={'A'..'Z'}
let rec SearchRegistryForInvalidDrive (start:RegistryKey) = seq{
let validDrives=GetDrives |> Seq.map (fun x -> x.Name.Substring(0,1))
let invalidDrives= Seq.toList validDrives |> List.filter(fun x-> not (List.exists2 x b)) //(List.exists is the wrong method I think, but it doesn't compile
I followed F#: Filter items found in one list from another list but could not apply it to my problem as both the solutions I see don't seem to compile. List.Contains doesn't exist (missing a reference?) and ListA - ListB doesn't compile either.
open System.IO
let driveLetters = set [ for d in DriveInfo.GetDrives() -> d.Name.[0] ]
let unused = set ['A'..'Z'] - driveLetters
Your first error is mixing between char and string, it is good to start with char:
let all = {'A'..'Z'}
let validDrives = GetDrives |> Seq.map (fun x -> x.Name.[0])
Now invalid drive letters are those letters which are in all but not in validDrives:
let invalidDrives =
all |> Seq.filter (fun c -> validDrives |> List.forall ((<>) c))
Since validDrives is traversed many times to check for membership, turning it to a set is better in this example:
let all = {'A'..'Z'}
let validDrives = GetDrives |> Seq.map (fun x -> x.Name.[0]) |> Set.ofSeq
let invalidDrives = all |> Seq.filter (not << validDrives.Contains)

Unpack tuple into another tuple

Suppose I need to construct a tuple of length three:
(x , y, z)
And I have a function which returns a tuple of length two - exampleFunction and the last two elements of the tuple to be constructed are from this tuple.
How can I do this without having to call the exampleFunction two times:
(x, fst exampleFunction , snd exampleFunction)
I just want to do / achieve something like
(x, exampleFunction)
but it complains that the tuples have unmatched length ( of course )
Not looking at doing let y,z = exampleFunction()
There may be a built in function, but a custom one would work just as well.
let repack (a,(b,c)) = (a,b,c)
repack (x,exampleFunction)
I'm not sure it if worth a separate answer, but both answers provided above are not optimal since both construct redundant Tuple<'a, Tuple<'b, 'c>> upon invocation of the helper function. I would say a custom operator would be better for both readability and performance:
let inline ( +# ) a (b,c) = a, b, c
let result = x +# yz // result is ('x, 'y, 'z)
The problem you have is that the function return a*b so the return type becomes 'a*('b*'c) which is different to 'a*'b*'c the best solution is a small helper function like
let inline flatten (a,(b,c)) = a,b,c
then you can do
(x,examplefunction) |> flatten
I have the following function in my common extension file.
You may find this useful.
let inline squash12 ((a,(b,c) ):('a*('b*'c) )):('a*'b*'c ) = (a,b,c )
let inline squash21 (((a,b),c ):(('a*'b)*'c )):('a*'b*'c ) = (a,b,c )
let inline squash13 ((a,(b,c,d)):('a*('b*'c*'d))):('a*'b*'c*'d) = (a,b,c,d)
let seqsquash12 (sa:seq<'a*('b*'c) >) = sa |> Seq.map squash12
let seqsquash21 (sa:seq<('a*'b)*'c >) = sa |> Seq.map squash21
let seqsquash13 (sa:seq<'a*('b*'c*'d)>) = sa |> Seq.map squash13
let arrsquash12 (sa:('a*('b*'c) ) array) = sa |> Array.map squash12
let arrsquash21 (sa:(('a*'b)*'c ) array) = sa |> Array.map squash21
let arrsquash13 (sa:('a*('b*'c*'d)) array) = sa |> Array.map squash13

Resources