Split seq in F# - f#

I should split seq<a> into seq<seq<a>> by an attribute of the elements. If this attribute equals by a given value it must be 'splitted' at that point. How can I do that in FSharp?
It should be nice to pass a 'function' to it that returns a bool if must be splitted at that item or no.
Sample:
Input sequence: seq: {1,2,3,4,1,5,6,7,1,9}
It should be splitted at every items when it equals 1, so the result should be:
seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}

All you're really doing is grouping--creating a new group each time a value is encountered.
let splitBy f input =
let i = ref 0
input
|> Seq.map (fun x ->
if f x then incr i
!i, x)
|> Seq.groupBy fst
|> Seq.map (fun (_, b) -> Seq.map snd b)
Example
let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)
Again, shorter, with Stephen's nice improvements:
let splitBy f input =
let i = ref 0
input
|> Seq.groupBy (fun x ->
if f x then incr i
!i)
|> Seq.map snd

Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:
let splitUsing special (input:seq<_>) = seq {
use en = input.GetEnumerator()
let finished = ref false
let start = ref true
let rec taking () = seq {
if not (en.MoveNext()) then finished := true
elif en.Current = special then start := true
else
yield en.Current
yield! taking() }
yield taking()
while not (!finished) do
yield Seq.concat [ Seq.singleton special; taking()] }
I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).
Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.

The following is an impure implementation but yields immutable sequences lazily:
let unflatten f s = seq {
let buffer = ResizeArray()
let flush() = seq {
if buffer.Count > 0 then
yield Seq.readonly (buffer.ToArray())
buffer.Clear() }
for item in s do
if f item then yield! flush()
buffer.Add(item)
yield! flush() }
f is the function used to test whether an element should be a split point:
[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)

Probably no the most efficient solution, but this works:
let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s
let takeAndSkipUntil f = takeAndSkipWhile (f >> not)
let rec splitOn f s =
if Seq.isEmpty s then
Seq.empty
else
let pre, post =
if f (Seq.head s) then
takeAndSkipUntil f (Seq.skip 1 s)
|> fun (a, b) ->
Seq.append [Seq.head s] a, b
else
takeAndSkipUntil f s
if Seq.isEmpty pre then
Seq.singleton post
else
Seq.append [pre] (splitOn f post)
splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>
The type of splitOn is ('a -> bool) -> seq<'a> -> seq>. I haven't tested it on many inputs, but it seems to work.

In case you are looking for something which actually works like split as an string split (i.e the item is not included on which the predicate returns true) the below is what I came up with.. tried to be as functional as possible :)
let fromEnum (input : 'a IEnumerator) =
seq {
while input.MoveNext() do
yield input.Current
}
let getMore (input : 'a IEnumerator) =
if input.MoveNext() = false then None
else Some ((input |> fromEnum) |> Seq.append [input.Current])
let splitBy (f : 'a -> bool) (input : 'a seq) =
use s = input.GetEnumerator()
let rec loop (acc : 'a seq seq) =
match s |> getMore with
| None -> acc
| Some x ->[x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
|> Seq.append acc
|> loop
loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)
seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"

Related

F# : filtering None out and keeping only Some

A quick question on how to effectively group/filter list/seq.
Filter for only records where the optional field is not None
Remove the "option" parameter to make future processes easier (as None has been filtered out)
Group (this is of no problem I believe)
Am I using the best approach?
Thanks!
type tmp = {
A : string
B : int option }
type tmp2 = {
A : string
B : int }
let inline getOrElse (dft: 'a) (x: 'a option) =
match x with
| Some v -> v
| _ -> dft
let getGrouped (l: tmp list) =
l |> List.filter (fun a -> a.B.IsSome)
|> List.map (fun a -> {A = a.A ; B = (getOrElse 0 (a.B)) })
|> List.groupBy (fun a -> a.A)
The most natural approach for map+filter when option is involved is to use choose, which combines those two operations and drops the option wrapper from the filtered output.
Your example would look something like this:
let getGrouped (l: tmp list) =
l
|> List.choose (fun a ->
a.B
|> Option.map (fun b -> {A = a.A; B = b})
|> List.groupBy (fun a -> a.A)
The simple solution is just use the property that an option can be transformed to list with one or zero elements then you can define a function like:
let t1 ({A=a; B=b} : tmp) =
match b with
| (Some i) -> [{ A = a; B= i}]
| _ -> []
let getGrouped (l: tmp list) =
l |> List.collect t1
|> List.groupBy (fun a -> a.A)

How to create a dependency between observables?

I want a tool for testing Rx components that would work like this:
Given an order of the events specified as a 'v seq and a key selector function (keySelector :: 'v -> 'k) I want to create a Map<'k, IObservable<'k>> where the guarantee is that the groupped observables yield the values in the global order defined by the above enumerable.
For example:
makeObservables isEven [1;2;3;4;5;6]
...should produce
{ true : -2-4-6|,
false: 1-3-5| }
This is my attempt looks like this:
open System
open System.Reactive.Linq
open FSharp.Control.Reactive
let subscribeAfter (o1: IObservable<'a>) (o2 : IObservable<'b>) : IObservable<'b> =
fun (observer : IObserver<'b>) ->
let tempObserver = { new IObserver<'a> with
member this.OnNext x = ()
member this.OnError e = observer.OnError e
member this.OnCompleted () = o2 |> Observable.subscribeObserver observer |> ignore
}
o1.Subscribe tempObserver
|> Observable.Create
let makeObservables (keySelector : 'a -> 'k) (xs : 'a seq) : Map<'k, IObservable<'a>> =
let makeDependencies : ('k * IObservable<'a>) seq -> ('k * IObservable<'a>) seq =
let makeDep ((_, o1), (k2, o2)) = (k2, subscribeAfter o1 o2)
Seq.pairwise
>> Seq.map makeDep
let makeObservable x = (keySelector x, Observable.single x)
let firstItem =
Seq.head xs
|> makeObservable
|> Seq.singleton
let dependentObservables =
xs
|> Seq.map makeObservable
|> makeDependencies
dependentObservables
|> Seq.append firstItem
|> Seq.groupBy fst
|> Seq.map (fun (k, obs) -> (k, obs |> Seq.map snd |> Observable.concatSeq))
|> Map.ofSeq
[<EntryPoint>]
let main argv =
let isEven x = (x % 2 = 0)
let splits : Map<bool, IObservable<int>> =
[1;2;3;4;5]
|> makeObservables isEven
use subscription =
splits
|> Map.toSeq
|> Seq.map snd
|> Observable.mergeSeq
|> Observable.subscribe (printfn "%A")
Console.ReadKey() |> ignore
0 // return an integer exit code
...but the results are not as expected and the observed values are not in the global order.
Apparently the items in each group are yield correctly but when the groups are merged its more like a concat then a merge
The expected output is: 1 2 3 4 5
...but the actual output is 1 3 5 2 4
What am I doing wrong?
Thanks!
You describe wanting this:
{ true : -2-4-6|,
false: 1-3-5| }
But you're really creating this:
{ true : 246|,
false: 135| }
Since there's no time gaps between the items in the observables, the merge basically has a constant race condition. Rx guarantees that element 1 of a given sequence will fire before element 2, but Merge offers no guarantees around cases like this.
You need to introduce time gaps into your observables if you want Merge to be able to re-sequence in the original order.

Map every 2 elements to a function/constructor

I have an integer array in which I want to send every two elements from it to the constructor of another function.
Something like intArray |> Array.map (fun x, y -> new Point(x, y))
Is this possible? I'm new to F# and functional programming so I'm trying to avoid just looping through every 2 items in the array and adding the point to a list. I hope that's reasonable.
If using F# 4.0, use Gustavo's approach. For F# 3, you can do:
intArray
|> Seq.pairwise // get sequence of tuples of element (1,2); (2,3); (3,4); (4,5) etc
|> Seq.mapi (fun i xy -> i, xy) // combine the index with the tuple
|> Seq.filter (fun (i,_) -> i % 2 = 0) // Filter for only the even indices to get (1,2); (3,4)
|> Seq.map (fun xy -> Point xy) // make a point from the tuples
|> Array.ofSeq // convert back to array
You can use Array.chunkBySize:
intArray
|> Array.chunkBySize 2
|> Array.map (function
| [|x; y|] -> new Point (x, y)
| _ -> failwith "Array length is not even.")
An alternative solution to the existing answers, would be to write a custom function, that creates a list/an array of tuples using pattern matching:
let chunkify arr =
let rec chunkify acc lst =
if (List.length lst) > 1 then (* proceed if there are at least two elements *)
match lst with
(* save every constructed pair, until the input is not empty *)
| h1 :: h2 :: tail -> chunkify ([(h1, h2)] # acc) tail
| _ -> acc (* else return the list of pairs *)
else (* return the list of pairs *)
acc
chunkify List.empty (List.ofSeq arr) |> List.rev |> Array.ofSeq
The function can be then used like this:
// helper
let print = (fun (x:'a, y:'a) -> printfn "new Object(%A,%A)" x y)
// ints
[|1;2;3;4;5;6|]
|> chunkify
|> Array.iter print
// strings
[|"a";"b";"c";"d";"e"|]
|> chunkify
|> Array.map print
|> ignore
The output is:
new Object(1,2)
new Object(3,4)
new Object(5,6)
new Object("a","b")
new Object("c","d")
This solution/approach uses pattern matching with lists.

How do I write a ZipN-like function in F#?

I want to create a function with the signature seq<#seq<'a>> ->seq<seq<'a>> that acts like a Zip method taking a sequence of an arbitrary number of input sequences (instead of 2 or 3 as in Zip2 and Zip3) and returning a sequence of sequences instead of tuples as a result.
That is, given the following input:
[[1;2;3];
[4;5;6];
[7;8;9]]
it will return the result:
[[1;4;7];
[2;5;8];
[3;6;9]]
except with sequences instead of lists.
I am very new to F#, but I have created a function that does what I want, but I know it can be improved. It's not tail recursive and it seems like it could be simpler, but I don't know how yet. I also haven't found a good way to get the signature the way I want (accepting, e.g., an int list list as input) without a second function.
I know this could be implemented using enumerators directly, but I'm interested in doing it in a functional manner.
Here's my code:
let private Tail seq = Seq.skip 1 seq
let private HasLengthNoMoreThan n = Seq.skip n >> Seq.isEmpty
let rec ZipN_core = function
| seqs when seqs |> Seq.isEmpty -> Seq.empty
| seqs when seqs |> Seq.exists Seq.isEmpty -> Seq.empty
| seqs ->
let head = seqs |> Seq.map Seq.head
let tail = seqs |> Seq.map Tail |> ZipN_core
Seq.append (Seq.singleton head) tail
// Required to change the signature of the parameter from seq<seq<'a> to seq<#seq<'a>>
let ZipN seqs = seqs |> Seq.map (fun x -> x |> Seq.map (fun y -> y)) |> ZipN_core
let zipn items = items |> Matrix.Generic.ofSeq |> Matrix.Generic.transpose
Or, if you really want to write it yourself:
let zipn items =
let rec loop items =
seq {
match items with
| [] -> ()
| _ ->
match zipOne ([], []) items with
| Some(xs, rest) ->
yield xs
yield! loop rest
| None -> ()
}
and zipOne (acc, rest) = function
| [] -> Some(List.rev acc, List.rev rest)
| []::_ -> None
| (x::xs)::ys -> zipOne (x::acc, xs::rest) ys
loop items
Since this seems to be the canonical answer for writing a zipn in f#, I wanted to add a "pure" seq solution that preserves laziness and doesn't force us to load our full source sequences in memory at once like the Matrix.transpose function. There are scenarios where this is very important because it's a) faster and b) works with sequences that contain 100s of MBs of data!
This is probably the most un-idiomatic f# code I've written in a while but it gets the job done (and hey, why would there be sequence expressions in f# if you couldn't use them for writing procedural code in a functional language).
let seqdata = seq {
yield Seq.ofList [ 1; 2; 3 ]
yield Seq.ofList [ 4; 5; 6 ]
yield Seq.ofList [ 7; 8; 9 ]
}
let zipnSeq (src:seq<seq<'a>>) = seq {
let enumerators = src |> Seq.map (fun x -> x.GetEnumerator()) |> Seq.toArray
if (enumerators.Length > 0) then
try
while(enumerators |> Array.forall(fun x -> x.MoveNext())) do
yield enumerators |> Array.map( fun x -> x.Current)
finally
enumerators |> Array.iter (fun x -> x.Dispose())
}
zipnSeq seqdata |> Seq.toArray
val it : int [] [] = [|[|1; 4; 7|]; [|2; 5; 8|]; [|3; 6; 9|]|]
By the way, the traditional matrix transpose is much more terse than #Daniel's answer. Though, it requires a list or LazyList that both will eventually have the full sequence in memory.
let rec transpose =
function
| (_ :: _) :: _ as M -> List.map List.head M :: transpose (List.map List.tail M)
| _ -> []
To handle having sub-lists of different lengths, I've used option types to spot if we've run out of elements.
let split = function
| [] -> None, []
| h::t -> Some(h), t
let rec zipN listOfLists =
seq { let splitted = listOfLists |> List.map split
let anyMore = splitted |> Seq.exists (fun (f, _) -> f.IsSome)
if anyMore then
yield splitted |> List.map fst
let rest = splitted |> List.map snd
yield! rest |> zipN }
This would map
let ll = [ [ 1; 2; 3 ];
[ 4; 5; 6 ];
[ 7; 8; 9 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; Some 9]]
and
let ll = [ [ 1; 2; 3 ];
[ 4; 5; 6 ];
[ 7; 8 ] ]
to
seq
[seq [Some 1; Some 4; Some 7]; seq [Some 2; Some 5; Some 8];
seq [Some 3; Some 6; null]]
This takes a different approach to yours, but avoids using some of the operations that you had before (e.g. Seq.skip, Seq.append), which you should be careful with.
I realize that this answer is not very efficient, but I do like its succinctness:
[[1;2;3]; [4;5;6]; [7;8;9]]
|> Seq.collect Seq.indexed
|> Seq.groupBy fst
|> Seq.map (snd >> Seq.map snd);;
Another option:
let zipN ls =
let rec loop (a,b) =
match b with
|l when List.head l = [] -> a
|l ->
let x1,x2 =
(([],[]),l)
||> List.fold (fun acc elem ->
match acc,elem with
|(ah,at),eh::et -> ah#[eh],at#[et]
|_ -> acc)
loop (a#[x1],x2)
loop ([],ls)

f# array.filter based on a bool array

if I have array A, and I have another bool array isChosen with the same length of A how can I build a new array from A where isChosen is true? something like A.[isChosen]? I cannot use Array.filter directly since isChosen is not a function of A elements and there is no Array.filteri like Array.mapi.
zip should help:
let l = [|1;2;3|]
let f = [|true; false; true|]
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]
// or
let r = (l, f) ||> Seq.zip |> Seq.filter snd |> Seq.map fst |> Seq.toArray
Try the zip operator
seq.zip A isChosen
|> Seq.filter snd
|> Seq.map fst
|> Array.ofSeq
This will create a sequence of tuples where one value is from A and the other is from isChosen. This will pair the values together and make it very easy to filter them out in a Seq.filter expression
It's not as elegant or 'functional' as the other answers, but every once in a while I like a gentle reminder that you can use loops and array indices in F#:
let A = [|1;2;3|]
let isChosen = [|true; false; true|]
let r = [| for i in 0..A.Length-1 do
if isChosen.[i] then
yield A.[i] |]
printfn "%A" r
:)
And here are two more ways, just to demonstrate (even) more F# library functions:
let A = [|1;2;3|]
let isChosen = [|true;false;true|]
let B = Seq.map2 (fun x b -> if b then Some x else None) A isChosen
|> Seq.choose id
|> Seq.toArray
let C = Array.foldBack2 (fun x b acc -> if b then x::acc else acc) A isChosen []
|> List.toArray
My personal favorite for understandability (and therefore maintainability): desco's answer
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]

Resources