How do I write a ZipN-like function in F#?

I want to create a function with the signature seq<#seq<'a>> ->seq<seq<'a>> that acts like a Zip method taking a sequence of an arbitrary number of input sequences (instead of 2 or 3 as in Zip2 and Zip3) and returning a sequence of sequences instead of tuples as a result.
That is, given the following input:
[[1;2;3];
[4;5;6];
[7;8;9]]
it will return the result:
[[1;4;7];
[2;5;8];
[3;6;9]]
except with sequences instead of lists.
I am very new to F#, but I have created a function that does what I want, but I know it can be improved. It's not tail recursive and it seems like it could be simpler, but I don't know how yet. I also haven't found a good way to get the signature the way I want (accepting, e.g., an int list list as input) without a second function.
I know this could be implemented using enumerators directly, but I'm interested in doing it in a functional manner.
Here's my code:
let private Tail seq = Seq.skip 1 seq
let private HasLengthNoMoreThan n = Seq.skip n >> Seq.isEmpty
let rec ZipN_core = function
    | seqs when seqs |> Seq.isEmpty -> Seq.empty
    | seqs when seqs |> Seq.exists Seq.isEmpty -> Seq.empty
    | seqs ->
        let head = seqs |> Seq.map Seq.head
        let tail = seqs |> Seq.map Tail |> ZipN_core
        Seq.append (Seq.singleton head) tail
// Required to change the signature of the parameter from seq<seq<'a>> to seq<#seq<'a>>
let ZipN seqs = seqs |> Seq.map (fun x -> x |> Seq.map (fun y -> y)) |> ZipN_core

let zipn items = items |> Matrix.Generic.ofSeq |> Matrix.Generic.transpose
Or, if you really want to write it yourself:
let zipn items =
    let rec loop items =
        seq {
            match items with
            | [] -> ()
            | _ ->
                match zipOne ([], []) items with
                | Some(xs, rest) ->
                    yield xs
                    yield! loop rest
                | None -> ()
        }
    and zipOne (acc, rest) = function
        | [] -> Some(List.rev acc, List.rev rest)
        | []::_ -> None
        | (x::xs)::ys -> zipOne (x::acc, xs::rest) ys
    loop items

Since this seems to be the canonical answer for writing a zipn in f#, I wanted to add a "pure" seq solution that preserves laziness and doesn't force us to load our full source sequences in memory at once like the Matrix.transpose function. There are scenarios where this is very important because it's a) faster and b) works with sequences that contain 100s of MBs of data!
This is probably the most un-idiomatic f# code I've written in a while but it gets the job done (and hey, why would there be sequence expressions in f# if you couldn't use them for writing procedural code in a functional language).
let seqdata = seq {
    yield Seq.ofList [ 1; 2; 3 ]
    yield Seq.ofList [ 4; 5; 6 ]
    yield Seq.ofList [ 7; 8; 9 ]
}
let zipnSeq (src:seq<seq<'a>>) = seq {
    let enumerators = src |> Seq.map (fun x -> x.GetEnumerator()) |> Seq.toArray
    if (enumerators.Length > 0) then
        try
            while(enumerators |> Array.forall(fun x -> x.MoveNext())) do
                yield enumerators |> Array.map( fun x -> x.Current)
        finally
            enumerators |> Array.iter (fun x -> x.Dispose())
}
zipnSeq seqdata |> Seq.toArray
val it : int [] [] = [|[|1; 4; 7|]; [|2; 5; 8|]; [|3; 6; 9|]|]
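The question also asked for a signature that accepts, say, an int list list directly. A minimal sketch of one way to get that with the enumerator approach, assuming a flexible type annotation (seq<#seq<'a>>) on the parameter; the name zipnFlex is made up for this illustration and is not part of the answer above:

```fsharp
// Hypothetical variant of zipnSeq; the name zipnFlex is not from the original answer.
// The flexible type #seq<'a> accepts any inner collection that implements seq<'a>.
let zipnFlex (src: seq<#seq<'a>>) : seq<'a []> = seq {
    // Upcast each inner collection so GetEnumerator resolves to IEnumerable<'a>.
    let enumerators =
        src |> Seq.map (fun x -> (x :> seq<'a>).GetEnumerator()) |> Seq.toArray
    if enumerators.Length > 0 then
        try
            // Stop as soon as any input runs out.
            while enumerators |> Array.forall (fun e -> e.MoveNext()) do
                yield enumerators |> Array.map (fun e -> e.Current)
        finally
            enumerators |> Array.iter (fun e -> e.Dispose())
}

// Works on a plain list of lists, with no Seq.ofList on the inner lists.
let rows = zipnFlex [ [1; 2; 3]; [4; 5; 6]; [7; 8; 9] ] |> Seq.toArray
```

As in the version above, the rows come out as arrays, and enumeration stops at the shortest input.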
By the way, the traditional matrix transpose is much more terse than @Daniel's answer, though it requires a list or LazyList, both of which will eventually hold the full sequence in memory.
let rec transpose =
    function
    | (_ :: _) :: _ as M -> List.map List.head M :: transpose (List.map List.tail M)
    | _ -> []

To handle having sub-lists of different lengths, I've used option types to spot if we've run out of elements.
let split = function
    | [] -> None, []
    | h::t -> Some(h), t
let rec zipN listOfLists =
    seq { let splitted = listOfLists |> List.map split
          let anyMore = splitted |> Seq.exists (fun (f, _) -> f.IsSome)
          if anyMore then
              yield splitted |> List.map fst
              let rest = splitted |> List.map snd
              yield! rest |> zipN }
This would map
let ll = [ [ 1; 2; 3 ];
           [ 4; 5; 6 ];
           [ 7; 8; 9 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; Some 9]]
and
let ll = [ [ 1; 2; 3 ];
           [ 4; 5; 6 ];
           [ 7; 8 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; null]]
This takes a different approach to yours, but avoids using some of the operations that you had before (e.g. Seq.skip, Seq.append), which you should be careful with.

I realize that this answer is not very efficient, but I do like its succinctness:
[[1;2;3]; [4;5;6]; [7;8;9]]
|> Seq.collect Seq.indexed
|> Seq.groupBy fst
|> Seq.map (snd >> Seq.map snd);;

Another option:
let zipN ls =
    let rec loop (a,b) =
        match b with
        |l when List.head l = [] -> a
        |l ->
            let x1,x2 =
                (([],[]),l)
                ||> List.fold (fun acc elem ->
                    match acc,elem with
                    |(ah,at),eh::et -> ah@[eh],at@[et]
                    |_ -> acc)
            loop (a@[x1],x2)
    loop ([],ls)

Related

F# get set of subsets containing k elements

Given a set with n elements {1, 2, 3, ..., n}, I want to declare a function which returns the set containing the sets with k number of elements such as:
allSubsets 3 2
Would return [[1;2];[1;3];[2;3]] since those are the sets with 2 elements in a set created by 1 .. n
I've made the initial create-a-set-part but I'm a little stuck on how to find out all the subsets with k elements in it.
let allSubsets n k =
Set.ofList [1..n] |>
UPDATE:
I managed to get a working solution using yield:
let allSubsets n k =
    let setN = Set.ofList [1..n]
    let rec subsets s =
        set [
            if Set.count s = k then yield s
            for e in s do
                yield! subsets (Set.remove e s) ]
    subsets setN
allSubsets 3 2
val it : Set<Set<int>> = set [set [1; 2]; set [1; 3]; set [2; 3]]
But isn't it possible to do it a little cleaner?
What you have is pretty clean, but it's also pretty inefficient. Try running allSubsets 10 3 and you'll know what I mean.
This is what I came up with:
let input = Set.ofList [ 1 .. 15 ]
let subsets (size:int) (input: Set<'a>) =
    let rec inner elems =
        match elems with
        | [] -> [[]]
        | h::t ->
            List.fold (fun acc e ->
                if List.length e < size then
                    (h::e)::e::acc
                else e::acc) [] (inner t)
    inner (Set.toList input)
    |> Seq.choose (fun subset ->
        if List.length subset = size then
            Some <| Set.ofList subset
        else None)
    |> Set.ofSeq
subsets 3 input
The inner recursive function is a modified power set function from here. My first hunch was to generate the power set and then filter it, which would be pretty elegant, but that proved to be rather inefficient as well.
If this was to be production-quality code, I'd look into generating lists of indices of a given length, and use them to index into the input array. This is how FsCheck generates subsets, for example.
You can calculate the powerset and then filter it to keep only the subsets of the specified length:
let powerset n k =
    let lst = Set.toList n
    seq [0..(lst.Length |> pown 2)-1]
    |> Seq.map (fun i ->
        set ([0..lst.Length-1] |> Seq.choose (fun x ->
            if i &&& (pown 2 x) = 0 then None else Some lst.[x])))
    |> Seq.filter (Seq.length >> (=) k)
However, this is not efficient for large sets (n), or where k is close to n. It is easy to optimize, though: filter candidates out early based on the number of 1 digits in the binary representation of each number.
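One way the early filter could look, as a rough sketch rather than a tuned implementation: count the set bits of each mask and discard masks with the wrong count before any set is built. The names popCount and subsetsOfSize are made up for this illustration. It still walks all 2^n masks, so it only saves the cost of constructing the rejected subsets.

```fsharp
// Number of set bits in a non-negative int (hypothetical helper).
let rec popCount i = if i = 0 then 0 else (i &&& 1) + popCount (i >>> 1)

// Hypothetical name; enumerates bit masks like the powerset version,
// but rejects masks with the wrong bit count before building any set.
let subsetsOfSize (k: int) (s: Set<'a>) =
    let items = Set.toArray s
    seq { 0 .. (pown 2 items.Length) - 1 }
    |> Seq.filter (fun mask -> popCount mask = k)
    |> Seq.map (fun mask ->
        set [ for x in 0 .. items.Length - 1 do
                if (mask &&& (1 <<< x)) <> 0 then yield items.[x] ])

let pairs = subsetsOfSize 2 (set [1; 2; 3]) |> Set.ofSeq
```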
This function implements the popular n-choose-k function:
let n_choose_k (arr: 'a []) (k: int) : 'a list list =
    let len = Array.length arr
    let rec choose lo x =
        match x with
        | 0 -> [[]]
        | i ->
            [ for j in lo..(len-1) do
                for ks in choose (j+1) (i-1) do
                    yield arr.[j]::ks ]
    choose 0 k
> n_choose_k [|1..3|] 2;;
val it : int list list = [[1; 2]; [1; 3]; [2; 3]]
You can use Set.toArray and Set.ofList to convert to and from Set.
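To make that conversion concrete, here is a small sketch that wraps n_choose_k so it matches the allSubsets n k shape from the question. The wrapper is an illustration, not part of the answer; n_choose_k is repeated only so the block stands alone.

```fsharp
// n_choose_k is copied from the answer so this block is self-contained.
let n_choose_k (arr: 'a []) (k: int) : 'a list list =
    let len = Array.length arr
    let rec choose lo x =
        match x with
        | 0 -> [[]]
        | i ->
            [ for j in lo .. len - 1 do
                for ks in choose (j + 1) (i - 1) do
                    yield arr.[j] :: ks ]
    choose 0 k

// Set -> array on the way in, list -> Set on the way out.
let allSubsets n k =
    let source = Set.ofList [1..n]
    n_choose_k (Set.toArray source) k
    |> List.map Set.ofList
    |> Set.ofList

let result = allSubsets 3 2
```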
You can consider the following approach:
get powerset
let rec powerset xs =
    match xs with
    | [] -> [ [] ]
    | h :: t -> List.fold (fun ys s -> (h :: s) :: s :: ys) [] (powerset t)
filter all subsets with the necessary number of elements
let filtered xs k = List.filter (fun (x: 'a list) -> x.Length = k) xs
finally get the requested allSubsets
let allSubsets n k = Set.ofList (List.map (fun xs -> Set.ofList xs) (filtered (powerset [ 1 .. n ]) k))
Just to check and play with you can use:
printfn "%A" (allSubsets 3 2) // set [ set [1; 2]; set [1; 3]; set [2; 3] ]

Replicate list items n times in a F# sequence

I have a sequence in F#:
let n = 2
let seq1 = seq {
    yield "a"
    yield "b"
    yield "c"
}
I want to print every item in the sequence n times. I can do it this way:
let printx line t =
    for i = 1 to t do
        printfn "%s" line
seq1 |> Seq.iter (fun i -> printx i n)
Output of this is:
a
a
b
b
c
c
I think this is not the best solution. How to replicate the items in the sequence?
You can create a function to replicate each element of an input sequence:
let replicateAll n s = s |> Seq.collect (fun e -> Seq.init n (fun _ -> e))
then
seq1 |> replicateAll 2 |> Seq.iter (printfn "%s")
I would rather go with a sequence computation expression.
Looks cleaner:
let replicateAll n xs = seq {
    for x in xs do
        for _ in 1..n do
            yield x
}
There is actually a replicate function:
let xs = [1; 2; 3; 4; 5]
xs |> List.collect (fun x -> List.replicate 3 x)
//val it : int list = [1; 1; 1; 2; 2; 2; 3; 3; 3; 4; 4; 4; 5; 5; 5]
And you can do function composition on it, which will get rid of the lambda:
let repCol n xs = (List.replicate >> List.collect) n xs

How to split a sequence in F# based on another sequence in an idiomatic way

I have, in F#, 2 sequences, each containing distinct integers, strictly in ascending order: listMaxes and numbers.
If not Seq.isEmpty numbers, then it is guaranteed that not Seq.isEmpty listMaxes and Seq.last listMaxes >= Seq.last numbers.
I would like to implement in F# a function that returns a list of list of integers, whose List.length equals Seq.length listMaxes, containing the elements of numbers divided in lists, where the elements of listMaxes limit each group.
For example: called with the arguments
listMaxes = seq [ 25; 56; 65; 75; 88 ]
numbers = seq [ 10; 11; 13; 16; 20; 25; 31; 38; 46; 55; 65; 76; 88 ]
this function should return
[ [10; 11; 13; 16; 20; 25]; [31; 38; 46; 55]; [65]; List.empty; [76; 88] ]
I can implement this function, iterating over numbers only once:
let groupByListMaxes listMaxes numbers =
    if Seq.isEmpty numbers then
        List.replicate (Seq.length listMaxes) List.empty
    else
        List.ofSeq (seq {
            use nbe = numbers.GetEnumerator ()
            ignore (nbe.MoveNext ())
            for lmax in listMaxes do
                yield List.ofSeq (seq {
                    if nbe.Current <= lmax then
                        yield nbe.Current
                        while nbe.MoveNext () && nbe.Current <= lmax do
                            yield nbe.Current
                })
        })
But this code feels unclean, ugly, imperative, and very un-F#-y.
Is there any functional / F#-idiomatic way to achieve this?
Here's a version based on list interpretation, which is quite functional in style. You can use Seq.toList to convert between them, whenever you want to handle that. You could also use Seq.scan in conjunction with Seq.partition ((>=) max) if you want to use only library functions, but beware that it's very very easy to introduce a quadratic complexity in either computation or memory when doing that.
This is linear in both:
let splitAt value lst =
    let rec loop l1 = function
        | [] -> List.rev l1, []
        | h :: t when h > value -> List.rev l1, (h :: t)
        | h :: t -> loop (h :: l1) t
    loop [] lst
let groupByListMaxes listMaxes numbers =
    let rec loop acc lst = function
        | [] -> List.rev acc
        | h :: t ->
            let out, lst' = splitAt h lst
            loop (out :: acc) lst' t
    loop [] numbers listMaxes
It can be done like this with pattern matching and tail recursion:
let groupByListMaxes listMaxes numbers =
    let rec inner acc numbers =
        function
        | [] -> acc |> List.rev
        | max::tail ->
            let taken = numbers |> Seq.takeWhile ((>=) max) |> List.ofSeq
            let n = taken |> List.length
            inner (taken::acc) (numbers |> Seq.skip n) tail
    inner [] numbers (listMaxes |> List.ofSeq)
Update: I also got inspired by fold and came up with the following solution that strictly refrains from converting the input sequences.
let groupByListMaxes maxes numbers =
    let rec inner (acc, (cur, numbers)) max =
        match numbers |> Seq.tryHead with
        // Add n to the current list of n's less
        // than the local max
        | Some n when n <= max ->
            let remaining = numbers |> Seq.tail
            inner (acc, (n::cur, remaining)) max
        // Complete the current list by adding it
        // to the accumulated result and prepare
        // the next list for fold.
        | _ ->
            (List.rev cur)::acc, ([], numbers)
    maxes |> Seq.fold inner ([], ([], numbers)) |> fst |> List.rev
I have found a better implementation myself. Tips for improvements are still welcome.
Dealing with 2 sequences is really a pain. And I really do want to iterate over numbers only once without turning that sequence into a list. But then I realized that turning listMaxes (generally the shorter of the two sequences) into a list is less costly. That way only 1 sequence remains, and I can use Seq.fold over numbers.
What state do we want to keep and update while iterating with Seq.fold over numbers? First, it should definitely include the remainder of listMaxes; the previous maxes that we have already surpassed are no longer of interest. Second, the accumulated lists so far, although, like in the other answers, these can be kept in reverse order. More to the point: the state is a pair whose second element is a reversed list of reversed lists of the numbers so far.
let groupByListMaxes listMaxes numbers =
    let rec folder state number =
        match state with
        | m :: maxes, _ when number > m ->
            folder (maxes, List.empty :: snd state) number
        | m :: maxes, [] ->
            fst state, List.singleton (List.singleton number)
        | m :: maxes, h :: t ->
            fst state, (number :: h) :: t
        | [], _ ->
            failwith "Guaranteed not to happen"
    let listMaxesList = List.ofSeq listMaxes
    let initialState = listMaxesList, List.empty
    let reversed = snd (Seq.fold folder initialState numbers)
    let temp = List.rev (List.map List.rev reversed)
    let extraLength = List.length listMaxesList - List.length temp
    let extra = List.replicate extraLength List.empty
    List.concat [temp; extra]
I know this is an old question but I had a very similar problem and I think this is a simple solution:
let groupByListMaxes cs xs =
    List.scan (fun (_, xs) c -> List.partition (fun x -> x <= c) xs)
        ([], xs)
        cs
    |> List.skip 1
    |> List.map fst

Folding a list in F#

I have a pretty trivial task but I can't figure out how to make the solution prettier.
The goal is taking a List and returning results, based on whether they passed a predicate. The results should be grouped. Here's a simplified example:
Predicate: isEven
Inp : [2; 4; 3; 7; 6; 10; 4; 5]
Out: [[^^^^]......[^^^^^^^^]..]
Here's the code I have so far:
let f p ls =
    List.foldBack
        (fun el (xs, ys) -> if p el then (el::xs, ys) else ([], xs::ys))
        ls ([], [])
    |> List.Cons // (1)
    |> List.filter (not << List.isEmpty) // (2)
let even x = x % 2 = 0
let ret =
    [2; 4; 3; 7; 6; 10; 4; 5]
    |> f even
// expected [[2; 4]; [6; 10; 4]]
This code does not seem to be readable that much. Also, I don't like lines (1) and (2). Is there any better solution?
Here is my take. You need a few helper functions first:
// active pattern to choose between even and odd integers
let (|Even|Odd|) x = if (x % 2) = 0 then Even x else Odd x
// fold function to generate a state tuple of current values and accumulated values
let folder (current, result) x =
    match x, current with
    | Even x, _ -> x::current, result // even members are added to current list
    | Odd x, [] -> current, result // odd members are ignored when current is empty
    | Odd x, _ -> [], current::result // odd members start a new current
// test on data
[2; 4; 3; 7; 6; 10; 4; 5]
|> List.rev // reverse list since numbers are added to start of current
|> List.fold folder ([], []) // perform fold over list
|> function | [],x -> x | y,x -> y::x // check that current is List.empty, otherwise add to result
How about this one?
let folder p l = function
    | h::t when p(l) -> (l::h)::t
    | []::_ as a -> a
    | _ as a -> []::a
let f p ls =
    ls
    |> List.rev
    |> List.fold (fun a l -> folder p l a) [[]]
    |> List.filter ((<>) [])
At least the folder is crystal clear and effective, but then you pay the price for this by list reversing.
Here is a recursive solution based on a recursive List.filter
let rec _f p ls =
    match ls with
    | h::t ->
        if p(h) then
            match _f p t with
            | rh::rt -> (h::rh)::rt
            | [] -> (h::[])::[]
        else []::_f p t
    | [] -> [[]]
let f p ls = _f p ls |> List.filter (fun t -> t <> [])
Having to filter at the end does seem inelegant though.
Here you go. This function should also have fairly good performance.
let groupedFilter (predicate : 'T -> bool) (list : 'T list) =
    (([], []), list)
    ||> List.fold (fun (currentGroup, finishedGroups) el ->
        if predicate el then
            (el :: currentGroup), finishedGroups
        else
            match currentGroup with
            | [] ->
                [], finishedGroups
            | _ ->
                // This is the first non-matching element
                // following a matching element.
                // Finish processing the previous group then
                // add it to the finished groups list.
                [], ((List.rev currentGroup) :: finishedGroups))
    // Need to do a little clean-up after the fold.
    |> fun (currentGroup, finishedGroups) ->
        // If the current group is non-empty, finish it
        // and add it to the list of finished groups.
        let finishedGroups =
            match currentGroup with
            | [] -> finishedGroups
            | _ ->
                (List.rev currentGroup) :: finishedGroups
        // Reverse the finished groups list so the grouped
        // elements will be in their original order.
        List.rev finishedGroups;;
With the list reversing, I would like to go to #seq instead of list.
This version uses mutation (gasp!) internally for efficiency, but may also be a little slower with the overhead of seq. I think it is quite readable though.
let f p (ls) = seq {
    let l = System.Collections.Generic.List<'a>()
    for el in ls do
        if p el then
            l.Add el
        else
            if l.Count > 0 then yield l |> List.ofSeq
            l.Clear()
    if l.Count > 0 then yield l |> List.ofSeq
}
I can't think of a way to do this elegantly using higher order functions, but here's a solution using a list comprehension. I think it's fairly straightforward to read.
let f p ls =
    let rec loop xs =
        [ match xs with
          | [] -> ()
          | x::xs when p x ->
              let group, rest = collectGroup [x] xs
              yield group
              yield! loop rest
          | _::xs -> yield! loop xs ]
    and collectGroup acc = function
        | x::xs when p x -> collectGroup (x::acc) xs
        | xs -> List.rev acc, xs
    loop ls

Split seq in F#

I need to split a seq<a> into a seq<seq<a>> by an attribute of the elements: wherever this attribute equals a given value, the sequence must be split at that point. How can I do that in F#?
It would be nice to pass in a function that returns a bool indicating whether the sequence must be split at that item or not.
Sample:
Input sequence: seq: {1,2,3,4,1,5,6,7,1,9}
It should be split at every item that equals 1, so the result should be:
seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}
All you're really doing is grouping--creating a new group each time a value is encountered.
let splitBy f input =
    let i = ref 0
    input
    |> Seq.map (fun x ->
        if f x then incr i
        !i, x)
    |> Seq.groupBy fst
    |> Seq.map (fun (_, b) -> Seq.map snd b)
Example
let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)
Again, shorter, with Stephen's nice improvements:
let splitBy f input =
    let i = ref 0
    input
    |> Seq.groupBy (fun x ->
        if f x then incr i
        !i)
    |> Seq.map snd
Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:
let splitUsing special (input:seq<_>) = seq {
    use en = input.GetEnumerator()
    let finished = ref false
    let start = ref true
    let rec taking () = seq {
        if not (en.MoveNext()) then finished := true
        elif en.Current = special then start := true
        else
            yield en.Current
            yield! taking() }
    yield taking()
    while not (!finished) do
        yield Seq.concat [ Seq.singleton special; taking()] }
I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).
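To see that cost for yourself, here is a small experiment, sketched under the assumption that counting how often the source produces an element is a fair proxy for the work done. The names pulls and sumBySkipping are made up for this illustration.

```fsharp
// Counts how many times the underlying source produces an element.
let pulls = ref 0
let source = seq {
    for i in 1 .. 100 do
        pulls.Value <- pulls.Value + 1
        yield i }

// Head/skip recursion: every Seq.isEmpty and Seq.head call re-enumerates
// the whole chain of Seq.skip wrappers, starting from the source again.
let rec sumBySkipping (s: seq<int>) =
    if Seq.isEmpty s then 0
    else Seq.head s + sumBySkipping (Seq.skip 1 s)

pulls.Value <- 0
let direct = Seq.sum source
let pullsDirect = pulls.Value      // a single pass over the source

pulls.Value <- 0
let skipped = sumBySkipping source
let pullsSkipping = pulls.Value    // grows much faster than the input length

printfn "direct: %d pulls, head/skip: %d pulls" pullsDirect pullsSkipping
```

Both versions compute the same sum; only the number of elements pulled from the source differs.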
Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.
The following is an impure implementation but yields immutable sequences lazily:
let unflatten f s = seq {
    let buffer = ResizeArray()
    let flush() = seq {
        if buffer.Count > 0 then
            yield Seq.readonly (buffer.ToArray())
            buffer.Clear() }
    for item in s do
        if f item then yield! flush()
        buffer.Add(item)
    yield! flush() }
f is the function used to test whether an element should be a split point:
[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)
Probably not the most efficient solution, but this works:
let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s
let takeAndSkipUntil f = takeAndSkipWhile (f >> not)
let rec splitOn f s =
    if Seq.isEmpty s then
        Seq.empty
    else
        let pre, post =
            if f (Seq.head s) then
                takeAndSkipUntil f (Seq.skip 1 s)
                |> fun (a, b) ->
                    Seq.append [Seq.head s] a, b
            else
                takeAndSkipUntil f s
        if Seq.isEmpty pre then
            Seq.singleton post
        else
            Seq.append [pre] (splitOn f post)
splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>
The type of splitOn is ('a -> bool) -> seq<'a> -> seq<seq<'a>>. I haven't tested it on many inputs, but it seems to work.
In case you are looking for something that actually works like a string split (i.e., the item on which the predicate returns true is not included), below is what I came up with. I tried to be as functional as possible :)
open System.Collections.Generic // needed for IEnumerator<'a>
let fromEnum (input : 'a IEnumerator) =
    seq {
        while input.MoveNext() do
            yield input.Current
    }
let getMore (input : 'a IEnumerator) =
    if input.MoveNext() = false then None
    else Some ((input |> fromEnum) |> Seq.append [input.Current])
let splitBy (f : 'a -> bool) (input : 'a seq) =
    use s = input.GetEnumerator()
    let rec loop (acc : 'a seq seq) =
        match s |> getMore with
        | None -> acc
        | Some x ->
            [x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
            |> Seq.append acc
            |> loop
    loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)
seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"
