F# Finding distinct elements in list of tuples - f#

I have a list of tuples:
(string * (int * int)) list
let st = [("a1",(100,10)); ("a2",(50,20)); ("a3",(25,40))]
Where I want to make a function which returns a bool on a few conditions: Both the ints have to be more than 0 and the strings in the list have to be distinct.
So far I have:
let rec inv st = List.forall (fun (a,(n,p)) -> n>0 && p>0) st
But I'm having trouble figuring out how to find out if all the strings in the list are distinct. Any hints?

Use distinctBy:
let inv st =
List.length (List.distinctBy fst st) = List.length st && List.forall (fun (a,(n,p)) -> n>0 && p>0) st
or you can combine both checks in a single pipeline:
let inv st =
st
|> List.filter (fun (_,(n,p)) -> n>0 && p>0)
|> List.distinctBy fst
|> List.length
|> (=) (List.length st)

The shortest way would be to just compile the list of all distinct strings (via List.distinct) and see if it ended up the same size as the original list:
let allDistinctStirngs = st |> List.map fst |> List.distinct
let allStringsAreDistinct = List.length st = List.length allDistinctStrings

Related

F# list group by running total?

I have the following list of tuples ordered by the first item. I want to cluster the times by
If the second item of the tuple is greater then 50, it will be in its own cluster.
Otherwise, cluster the items whose sum is less than 50.
The order cannot be changed.
code:
let values =
[("ACE", 78);
("AMR", 3);
("Aam", 6);
("Acc", 1);
("Adj", 23);
("Aga", 12);
("All", 2);
("Ame", 4);
("Amo", 60);
//....
]
values |> Seq.groupBy(fun (k,v) -> ???)
The expected value will be
[["ACE"] // 78
["AMR"; "Aam"; "Acc"; "Adj"; "Aga"; "All"] // 47
["Ame"] // 4
["Amo"] // 60
....]
Ideally, I want to evenly distribute the second group (["AMR"; "Aam"; "Acc"; "Adj"; "Aga"; "All"] which got the sum of 47) and the third one (["Ame"] which has only 4).
How to implement it in F#?
I had the following solution. It uses a mutable variable. It's not F# idiomatic? Is for ... do imperative in F# or is it a syntactic sugar of some function construct?
seq {
let mutable c = []
for v in values |> Seq.sortBy(fun (k, _) -> k) do
let sum = c |> Seq.map(fun (_, v) -> v) |> Seq.sum
if not(c = []) && sum + (snd v) > 50
then
yield c
c <- [v]
else
c <- List.append c [v]
}
I think I got it. Not the nicest code ever, but works and is immutable.
let foldFn (acc:(string list * int) list) (name, value) =
let addToLast last =
let withoutLast = acc |> List.filter ((<>) last)
let newLast = [((fst last) # [name]), (snd last) + value]
newLast |> List.append withoutLast
match acc |> List.tryLast with
| None -> [[name],value]
| Some l ->
if (snd l) + value <= 50 then addToLast l
else [[name], value] |> List.append acc
values |> List.fold foldFn [] |> List.map fst
Update: Since append can be quite expensive operation, I added prepend only version (still fulfills original requirement to keep order).
let foldFn (acc:(string list * int) list) (name, value) =
let addToLast last =
let withoutLast = acc |> List.filter ((<>) last) |> List.rev
let newLast = ((fst last) # [name]), (snd last) + value
(newLast :: withoutLast) |> List.rev
match acc |> List.tryLast with
| None -> [[name],value]
| Some l ->
if (snd l) + value <= 50 then addToLast l
else ([name], value) :: (List.rev acc) |> List.rev
Note: There is still # operator on line 4 (when creating new list of names in cluster), but since the theoretical maximum amount of names in cluster is 50 (if all of them would be equal 1), the performance here is negligible.
If you remove List.map fst on last line, you would get sum value for each cluster in list.
Append operations are expensive. A straight-forward fold with prepended intermediate results is cheaper, even if the lists need to be reversed after processing.
["ACE", 78; "AMR", 3; "Aam", 6; "Acc", 1; "Adj", 23; "Aga", 12; "All", 2; "Ame", 4; "Amd", 6; "Amo", 60]
|> List.fold (fun (r, s1, s2) (t1, t2) ->
if t2 > 50 then [t1]::s1::r, [], 0
elif s2 + t2 > 50 then s1::r, [t1], t2
else r, t1::s1, s2 + t2 ) ([], [], 0)
|> fun (r, s1, _) -> s1::r
|> List.filter (not << List.isEmpty)
|> List.map List.rev
|> List.rev
// val it : string list list =
// [["ACE"]; ["AMR"; "Aam"; "Acc"; "Adj"; "Aga"; "All"]; ["Ame"; "Amd"];
// ["Amo"]]
Here is a recursive version - working much the same way as fold-versions:
let groupBySums data =
let rec group cur sum acc lst =
match lst with
| [] -> acc |> List.where (not << List.isEmpty) |> List.rev
| (name, value)::tail when value > 50 -> group [] 0 ([(name, value)]::(cur |> List.rev)::acc) tail
| (name, value)::tail ->
match sum + value with
| x when x > 50 -> group [(name, value)] 0 ((cur |> List.rev)::acc) tail
| _ -> group ((name, value)::cur) (sum + value) acc tail
(data |> List.sortBy (fun (name, _) -> name)) |> group [] 0 []
values |> groupBySums |> List.iter (printfn "%A")
Yet another solution using Seq.mapFold and Seq.groupBy:
let group values =
values
|> Seq.mapFold (fun (group, total) (name, count) ->
let newTotal = count + total
let newGroup = group + if newTotal > 50 then 1 else 0
(newGroup, name), (newGroup, if newGroup = group then newTotal else count)
) (0, 0)
|> fst
|> Seq.groupBy fst
|> Seq.map (snd >> Seq.map snd >> Seq.toList)
Invoke it like this:
[ "ACE", 78
"AMR", 3
"Aam", 6
"Acc", 1
"Adj", 23
"Aga", 12
"All", 2
"Ame", 4
"Amo", 60
]
|> group
|> Seq.iter (printfn "%A")
// ["ACE"]
// ["AMR"; "Aam"; "Acc"; "Adj"; "Aga"; "All"]
// ["Ame"]
// ["Amo"]

How to split a sequence in F# based on another sequence in an idiomatic way

I have, in F#, 2 sequences, each containing distinct integers, strictly in ascending order: listMaxes and numbers.
If not Seq.isEmpty numbers, then it is guaranteed that not Seq.isEmpty listMaxes and Seq.last listMaxes >= Seq.last numbers.
I would like to implement in F# a function that returns a list of list of integers, whose List.length equals Seq.length listMaxes, containing the elements of numbers divided in lists, where the elements of listMaxes limit each group.
For example: called with the arguments
listMaxes = seq [ 25; 56; 65; 75; 88 ]
numbers = seq [ 10; 11; 13; 16; 20; 25; 31; 38; 46; 55; 65; 76; 88 ]
this function should return
[ [10; 11; 13; 16; 20; 25]; [31; 38; 46; 55]; [65]; List.empty; [76; 88] ]
I can implement this function, iterating over numbers only once:
let groupByListMaxes listMaxes numbers =
if Seq.isEmpty numbers then
List.replicate (Seq.length listMaxes) List.empty
else
List.ofSeq (seq {
use nbe = numbers.GetEnumerator ()
ignore (nbe.MoveNext ())
for lmax in listMaxes do
yield List.ofSeq (seq {
if nbe.Current <= lmax then
yield nbe.Current
while nbe.MoveNext () && nbe.Current <= lmax do
yield nbe.Current
})
})
But this code feels unclean, ugly, imperative, and very un-F#-y.
Is there any functional / F#-idiomatic way to achieve this?
Here's a version based on list interpretation, which is quite functional in style. You can use Seq.toList to convert between them, whenever you want to handle that. You could also use Seq.scan in conjunction with Seq.partition ((>=) max) if you want to use only library functions, but beware that it's very very easy to introduce a quadratic complexity in either computation or memory when doing that.
This is linear in both:
let splitAt value lst =
let rec loop l1 = function
| [] -> List.rev l1, []
| h :: t when h > value -> List.rev l1, (h :: t)
| h :: t -> loop (h :: l1) t
loop [] lst
let groupByListMaxes listMaxes numbers =
let rec loop acc lst = function
| [] -> List.rev acc
| h :: t ->
let out, lst' = splitAt h lst
loop (out :: acc) lst' t
loop [] numbers listMaxes
It can be done like this with pattern matching and tail recursion:
let groupByListMaxes listMaxes numbers =
let rec inner acc numbers =
function
| [] -> acc |> List.rev
| max::tail ->
let taken = numbers |> Seq.takeWhile ((>=) max) |> List.ofSeq
let n = taken |> List.length
inner (taken::acc) (numbers |> Seq.skip n) tail
inner [] numbers (listMaxes |> List.ofSeq)
Update: I also got inspired by fold and came up with the following solution that strictly refrains from converting the input sequences.
let groupByListMaxes maxes numbers =
let rec inner (acc, (cur, numbers)) max =
match numbers |> Seq.tryHead with
// Add n to the current list of n's less
// than the local max
| Some n when n <= max ->
let remaining = numbers |> Seq.tail
inner (acc, (n::cur, remaining)) max
// Complete the current list by adding it
// to the accumulated result and prepare
// the next list for fold.
| _ ->
(List.rev cur)::acc, ([], numbers)
maxes |> Seq.fold inner ([], ([], numbers)) |> fst |> List.rev
I have found a better implementation myself. Tips for improvements are still welcome.
Dealing with 2 sequences is really a pain. And I really do want to iterate over numbers only once without turning that sequence into a list. But then I realized that turning listMaxes (generally the shorter of the sequences) is less costly. That way only 1 sequence remains, and I can use Seq.fold over numbers.
What should be the state that we want to keep and change while iterating with Seq.fold over numbers? First, it should definitely include the remaining of the listMaxes, yet the previous maxes that we already have surpassed are no longer of interest. Second, the accumulated lists so far, although, like in the other answers, these can be kept in reverse order. More to the point: the state is a couple which has as second element a reversed list of reversed lists of the numbers so far.
let groupByListMaxes listMaxes numbers =
let rec folder state number =
match state with
| m :: maxes, _ when number > m ->
folder (maxes, List.empty :: snd state) number
| m :: maxes, [] ->
fst state, List.singleton (List.singleton number)
| m :: maxes, h :: t ->
fst state, (number :: h) :: t
| [], _ ->
failwith "Guaranteed not to happen"
let listMaxesList = List.ofSeq listMaxes
let initialState = listMaxesList, List.empty
let reversed = snd (Seq.fold folder initialState numbers)
let temp = List.rev (List.map List.rev reversed)
let extraLength = List.length listMaxesList - List.length temp
let extra = List.replicate extraLength List.empty
List.concat [temp; extra]
I know this is an old question but I had a very similar problem and I think this is a simple solution:
let groupByListMaxes cs xs =
List.scan (fun (_, xs) c -> List.partition (fun x -> x <= c) xs)
([], xs)
cs
|> List.skip 1
|> List.map fst

How do I write a ZipN-like function in F#?

I want to create a function with the signature seq<#seq<'a>> ->seq<seq<'a>> that acts like a Zip method taking a sequence of an arbitrary number of input sequences (instead of 2 or 3 as in Zip2 and Zip3) and returning a sequence of sequences instead of tuples as a result.
That is, given the following input:
[[1;2;3];
[4;5;6];
[7;8;9]]
it will return the result:
[[1;4;7];
[2;5;8];
[3;6;9]]
except with sequences instead of lists.
I am very new to F#, but I have created a function that does what I want, but I know it can be improved. It's not tail recursive and it seems like it could be simpler, but I don't know how yet. I also haven't found a good way to get the signature the way I want (accepting, e.g., an int list list as input) without a second function.
I know this could be implemented using enumerators directly, but I'm interested in doing it in a functional manner.
Here's my code:
let private Tail seq = Seq.skip 1 seq
let private HasLengthNoMoreThan n = Seq.skip n >> Seq.isEmpty
let rec ZipN_core = function
| seqs when seqs |> Seq.isEmpty -> Seq.empty
| seqs when seqs |> Seq.exists Seq.isEmpty -> Seq.empty
| seqs ->
let head = seqs |> Seq.map Seq.head
let tail = seqs |> Seq.map Tail |> ZipN_core
Seq.append (Seq.singleton head) tail
// Required to change the signature of the parameter from seq<seq<'a> to seq<#seq<'a>>
let ZipN seqs = seqs |> Seq.map (fun x -> x |> Seq.map (fun y -> y)) |> ZipN_core
let zipn items = items |> Matrix.Generic.ofSeq |> Matrix.Generic.transpose
Or, if you really want to write it yourself:
let zipn items =
let rec loop items =
seq {
match items with
| [] -> ()
| _ ->
match zipOne ([], []) items with
| Some(xs, rest) ->
yield xs
yield! loop rest
| None -> ()
}
and zipOne (acc, rest) = function
| [] -> Some(List.rev acc, List.rev rest)
| []::_ -> None
| (x::xs)::ys -> zipOne (x::acc, xs::rest) ys
loop items
Since this seems to be the canonical answer for writing a zipn in f#, I wanted to add a "pure" seq solution that preserves laziness and doesn't force us to load our full source sequences in memory at once like the Matrix.transpose function. There are scenarios where this is very important because it's a) faster and b) works with sequences that contain 100s of MBs of data!
This is probably the most un-idiomatic f# code I've written in a while but it gets the job done (and hey, why would there be sequence expressions in f# if you couldn't use them for writing procedural code in a functional language).
let seqdata = seq {
yield Seq.ofList [ 1; 2; 3 ]
yield Seq.ofList [ 4; 5; 6 ]
yield Seq.ofList [ 7; 8; 9 ]
}
let zipnSeq (src:seq<seq<'a>>) = seq {
let enumerators = src |> Seq.map (fun x -> x.GetEnumerator()) |> Seq.toArray
if (enumerators.Length > 0) then
try
while(enumerators |> Array.forall(fun x -> x.MoveNext())) do
yield enumerators |> Array.map( fun x -> x.Current)
finally
enumerators |> Array.iter (fun x -> x.Dispose())
}
zipnSeq seqdata |> Seq.toArray
val it : int [] [] = [|[|1; 4; 7|]; [|2; 5; 8|]; [|3; 6; 9|]|]
By the way, the traditional matrix transpose is much more terse than #Daniel's answer. Though, it requires a list or LazyList that both will eventually have the full sequence in memory.
let rec transpose =
function
| (_ :: _) :: _ as M -> List.map List.head M :: transpose (List.map List.tail M)
| _ -> []
To handle having sub-lists of different lengths, I've used option types to spot if we've run out of elements.
let split = function
| [] -> None, []
| h::t -> Some(h), t
let rec zipN listOfLists =
seq { let splitted = listOfLists |> List.map split
let anyMore = splitted |> Seq.exists (fun (f, _) -> f.IsSome)
if anyMore then
yield splitted |> List.map fst
let rest = splitted |> List.map snd
yield! rest |> zipN }
This would map
let ll = [ [ 1; 2; 3 ];
[ 4; 5; 6 ];
[ 7; 8; 9 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; Some 9]]
and
let ll = [ [ 1; 2; 3 ];
[ 4; 5; 6 ];
[ 7; 8 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; null]]
This takes a different approach to yours, but avoids using some of the operations that you had before (e.g. Seq.skip, Seq.append), which you should be careful with.
I realize that this answer is not very efficient, but I do like its succinctness:
[[1;2;3]; [4;5;6]; [7;8;9]]
|> Seq.collect Seq.indexed
|> Seq.groupBy fst
|> Seq.map (snd >> Seq.map snd);;
Another option:
let zipN ls =
let rec loop (a,b) =
match b with
|l when List.head l = [] -> a
|l ->
let x1,x2 =
(([],[]),l)
||> List.fold (fun acc elem ->
match acc,elem with
|(ah,at),eh::et -> ah#[eh],at#[et]
|_ -> acc)
loop (a#[x1],x2)
loop ([],ls)

Split seq in F#

I should split seq<a> into seq<seq<a>> by an attribute of the elements. If this attribute equals by a given value it must be 'splitted' at that point. How can I do that in FSharp?
It should be nice to pass a 'function' to it that returns a bool if must be splitted at that item or no.
Sample:
Input sequence: seq: {1,2,3,4,1,5,6,7,1,9}
It should be splitted at every items when it equals 1, so the result should be:
seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}
All you're really doing is grouping--creating a new group each time a value is encountered.
let splitBy f input =
let i = ref 0
input
|> Seq.map (fun x ->
if f x then incr i
!i, x)
|> Seq.groupBy fst
|> Seq.map (fun (_, b) -> Seq.map snd b)
Example
let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)
Again, shorter, with Stephen's nice improvements:
let splitBy f input =
let i = ref 0
input
|> Seq.groupBy (fun x ->
if f x then incr i
!i)
|> Seq.map snd
Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:
let splitUsing special (input:seq<_>) = seq {
use en = input.GetEnumerator()
let finished = ref false
let start = ref true
let rec taking () = seq {
if not (en.MoveNext()) then finished := true
elif en.Current = special then start := true
else
yield en.Current
yield! taking() }
yield taking()
while not (!finished) do
yield Seq.concat [ Seq.singleton special; taking()] }
I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).
Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.
The following is an impure implementation but yields immutable sequences lazily:
let unflatten f s = seq {
let buffer = ResizeArray()
let flush() = seq {
if buffer.Count > 0 then
yield Seq.readonly (buffer.ToArray())
buffer.Clear() }
for item in s do
if f item then yield! flush()
buffer.Add(item)
yield! flush() }
f is the function used to test whether an element should be a split point:
[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)
Probably no the most efficient solution, but this works:
let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s
let takeAndSkipUntil f = takeAndSkipWhile (f >> not)
let rec splitOn f s =
if Seq.isEmpty s then
Seq.empty
else
let pre, post =
if f (Seq.head s) then
takeAndSkipUntil f (Seq.skip 1 s)
|> fun (a, b) ->
Seq.append [Seq.head s] a, b
else
takeAndSkipUntil f s
if Seq.isEmpty pre then
Seq.singleton post
else
Seq.append [pre] (splitOn f post)
splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>
The type of splitOn is ('a -> bool) -> seq<'a> -> seq>. I haven't tested it on many inputs, but it seems to work.
In case you are looking for something which actually works like split as an string split (i.e the item is not included on which the predicate returns true) the below is what I came up with.. tried to be as functional as possible :)
let fromEnum (input : 'a IEnumerator) =
seq {
while input.MoveNext() do
yield input.Current
}
let getMore (input : 'a IEnumerator) =
if input.MoveNext() = false then None
else Some ((input |> fromEnum) |> Seq.append [input.Current])
let splitBy (f : 'a -> bool) (input : 'a seq) =
use s = input.GetEnumerator()
let rec loop (acc : 'a seq seq) =
match s |> getMore with
| None -> acc
| Some x ->[x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
|> Seq.append acc
|> loop
loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)
seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"

f# array.filter based on a bool array

if I have array A, and I have another bool array isChosen with the same length of A how can I build a new array from A where isChosen is true? something like A.[isChosen]? I cannot use Array.filter directly since isChosen is not a function of A elements and there is no Array.filteri like Array.mapi.
zip should help:
let l = [|1;2;3|]
let f = [|true; false; true|]
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]
// or
let r = (l, f) ||> Seq.zip |> Seq.filter snd |> Seq.map fst |> Seq.toArray
Try the zip operator
seq.zip A isChosen
|> Seq.filter snd
|> Seq.map fst
|> Array.ofSeq
This will create a sequence of tuples where one value is from A and the other is from isChosen. This will pair the values together and make it very easy to filter them out in a Seq.filter expression
It's not as elegant or 'functional' as the other answers, but every once in a while I like a gentle reminder that you can use loops and array indices in F#:
let A = [|1;2;3|]
let isChosen = [|true; false; true|]
let r = [| for i in 0..A.Length-1 do
if isChosen.[i] then
yield A.[i] |]
printfn "%A" r
:)
And here are two more ways, just to demonstrate (even) more F# library functions:
let A = [|1;2;3|]
let isChosen = [|true;false;true|]
let B = Seq.map2 (fun x b -> if b then Some x else None) A isChosen
|> Seq.choose id
|> Seq.toArray
let C = Array.foldBack2 (fun x b acc -> if b then x::acc else acc) A isChosen []
|> List.toArray
My personal favorite for understandability (and therefore maintainability): desco's answer
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]

Resources