Seq.fold and boolean accumulator - f#

I can never find the source code of the F# core libraries. I know it is supposedly open, but Google has not helped me locate it; otherwise I would have looked up the implementation of Seq.fold. So here goes the question.
Does anybody see any issue with the following snippet:
let success =
    myList
    |> Seq.fold
        (fun acc item -> evaluation item)
        false
Logically it doesn't seem to hold water, and I can and will experiment to test it, but I wanted to ask the community. If any single evaluation inside of myList returns false, I want the success variable to be false...
So the test:
let myList = [true; true]
let success = List.fold (fun acc item -> acc && item) true myList
and
let myList = [true; false; true]
let success = List.fold (fun acc item -> acc && item) true myList
do return the proper results - I just would be more comfy seeing the source...

I think what you're looking for is something like this:
let success =
    myList
    |> Seq.fold
        (fun acc item -> acc && evaluation item)
        true
This also offers "short-circuit" evaluation, so that if acc is false from a previous evaluation, evaluation item won't run and the expression will simply return false.
MSDN documentation for fold operator

Seq.exists will short circuit:
let success =
    [1;2;3;40;5;2]
    |> Seq.exists (fun item->(item>30))
    |> not

I get that this is an old question, but the following may be relevant to those who have a similar question.
About the specific question here
There already exists a function that returns false as soon as one element in a sequence is false: Seq.forall.
So the easiest answer to the question is in fact:
let success = Seq.forall evaluation myList
which is slightly easier to grasp than TechNeilogy’s (rewritten) answer
let success = not (Seq.exists (evaluation >> not) myList)
Both in the accepted answer by Wesley Wiser and in this answer, the evaluation function is not evaluated on the items after the first item that evaluates to false.
But, as Pascal Cuoq correctly remarked, in the accepted answer all the elements of the remainder of the list are still iterated over, which is useless.
In contrast, Seq.forall really stops iterating when there is no use in continuing. So do Seq.exists, Seq.takeWhile, …
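To make the difference concrete, here is a small sketch that counts how often the predicate runs and how many elements are pulled from the source. The helper name measure and the sample data are made up for illustration.

```fsharp
// Hypothetical helper: runs a strategy and reports
// (result, predicate calls, elements enumerated).
let measure (run: (bool -> bool) -> seq<bool> -> bool) =
    let evaluated = ref 0
    let enumerated = ref 0
    let source =
        seq {
            for x in [true; false; true; true] do
                enumerated.Value <- enumerated.Value + 1
                yield x
        }
    let evaluation item =
        evaluated.Value <- evaluated.Value + 1
        item
    let result = run evaluation source
    result, evaluated.Value, enumerated.Value

// Fold with && skips the predicate after the first false, but still walks every element.
let viaFold =
    measure (fun evaluation source ->
        Seq.fold (fun acc item -> acc && evaluation item) true source)

// Seq.forall stops pulling elements at the first false.
let viaForall = measure Seq.forall

printfn "fold:   %A" viaFold     // (false, 2, 4)
printfn "forall: %A" viaForall   // (false, 2, 2)
```

Both strategies call the predicate twice, but only Seq.forall stops enumerating after the second element.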
About short-circuiting a folding in general
There are other cases where one wants to short-circuit a folding. It can be done.
Step 1: Define a folder with some kind of indication that the state won't change during the traversal of the rest of the source sequence, so the folding should be short-circuited.
Step 2: Use Seq.scan instead of Seq.fold.
Seq.scan is like Seq.fold, takes the same arguments, but computes on-demand, and returns not just the final state, but the sequence of all intermediate states and the final state.
It follows that (for finite mySequence): Seq.last (Seq.scan folder initialState mySequence) = Seq.fold folder initialState mySequence
Step 3: Use a short-circuiting function on the output of Seq.scan. Take your pick: Seq.takeWhile, Seq.forall, Seq.exists, …
In the following example, the state becomes None when a duplicate element is found, which means that the scanning may be short-circuited.
let allDistinct mySequence =
    let folder state element =
        match state with
        | Some elementsSoFar when not (Set.contains element elementsSoFar) ->
            Some (Set.add element elementsSoFar)
        | _ ->
            None
    let initialState = Some Set.empty
    let scanning = Seq.scan folder initialState mySequence
    Seq.forall Option.isSome scanning
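As a quick check of the short-circuiting, the sketch below repeats the function and runs it on an infinite sequence; the test inputs are made up for illustration. A plain Seq.fold would never return on such an input, whereas the scan-based version stops at the first duplicate.

```fsharp
let allDistinct mySequence =
    let folder state element =
        match state with
        | Some elementsSoFar when not (Set.contains element elementsSoFar) ->
            Some (Set.add element elementsSoFar)
        | _ -> None
    Seq.scan folder (Some Set.empty) mySequence
    |> Seq.forall Option.isSome

// Infinite input: 0, 1, 2, 0, 1, 2, ... The first duplicate appears at the fourth element.
let cycling = Seq.initInfinite (fun i -> i % 3)

let cyclingIsDistinct = allDistinct cycling    // false, and it terminates
let listIsDistinct = allDistinct [1; 2; 3]     // true

// For finite input, the last scanned state equals the result of the fold.
let add acc x = acc + x
let lastOfScan = Seq.last (Seq.scan add 0 [1; 2; 3])
let folded = Seq.fold add 0 [1; 2; 3]
```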

Hmmmm, I upgraded my Visual Studio and F# recently, and can't seem to locate the directory containing the F# library code. But, for what it's worth, Seq.fold is equivalent to the following:
let fold f seed items =
    let mutable res = seed
    for item in items do
        res <- f res item
    res
"If any single evaluation inside of myList returns false, I want the success variable to be false..."
It depends on how your evaluation function is implemented. If you want to return false when any of your items are false, use Seq.forall instead.

something like this
let l = [true; true; true; false; true]
let eval x = x
let x = (true, l) ||> Seq.fold(fun acc item -> acc && (eval item))
or you want to stop evaluation on first false result?
let l = [true; false; true]
l |> Seq.forall id

As for the original source, here are the fold functions from the August 10, 2010 release.
Shouldn't really need to concern yourself over the implementation, but seeing it can often be educational.
// Seq module
let fold<'T,'State> f (x:'State) (source : seq<'T>) =
    checkNonNull "source" source
    use e = source.GetEnumerator()
    let mutable state = x
    while e.MoveNext() do
        state <- f state e.Current;
    state
// Array module
let fold<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (array:'T[]) =
    checkNonNull "array" array
    let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
    let mutable state = acc
    let len = array.Length
    for i = 0 to len - 1 do
        state <- f.Invoke(state,array.[i])
    state
// List module
let fold<'T,'State> f (s:'State) (list: 'T list) =
    match list with
    | [] -> s
    | _ ->
        let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
        let rec loop s xs =
            match xs with
            | [] -> s
            | h::t -> loop (f.Invoke(s,h)) t
        loop s list
// MapTree module (Used by Map module)
let rec fold (f:OptimizedClosures.FSharpFunc<_,_,_,_>) x m =
    match m with
    | MapEmpty -> x
    | MapOne(k,v) -> f.Invoke(x,k,v)
    | MapNode(k,v,l,r,_) ->
        let x = fold f x l
        let x = f.Invoke(x,k,v)
        fold f x r
// Map module
let fold<'Key,'T,'State when 'Key : comparison> f (z:'State) (m:Map<'Key,'T>) =
    let f = OptimizedClosures.FSharpFunc<_,_,_,_>.Adapt(f)
    MapTree.fold f z m.Tree
// SetTree module (Used by Set module)
let rec fold f x m =
    match m with
    | SetNode(k,l,r,_) ->
        let x = fold f x l in
        let x = f x k
        fold f x r
    | SetOne(k) -> f x k
    | SetEmpty -> x
// Set module
let fold<'T,'State when 'T : comparison> f (z:'State) (s : Set<'T>) =
    SetTree.fold f z s.Tree

Related

F# List.exists on two lists

I have two lists listA and listB where I want to return true if listB contains any element also in listA.
let listA = ["A";"B";"C"]
let listB = ["D";"E";"A"]
Should return true in this case. I feel like this should be easy to solve and I'm missing something fundamental somewhere.
For example, why can't I do like this?
let testIntersect = for elem in listA do List.exists (fun x -> x = elem) listB
You can't write something like your example code because a plain for doesn't return a result, it just evaluates an expression for its side-effects. You could write the code in a for comprehension:
let testIntersect listA listB =
    [for elem in listA do yield List.exists (fun x -> x = elem) listB]
Of course, this then returns a bool list rather than a single bool.
val testIntersect :
listA:seq<'a> -> listB:'a list -> bool list when 'a : equality
let listA = ["A";"B";"C"]
let listB = ["D";"E";"A"]
testIntersect listA listB
val it : bool list = [true; false; false]
So, we can use the List.exists function to ensure that a true occurs at least once:
let testIntersect listA listB =
    [for elem in listA do yield List.exists (fun x -> x = elem) listB]
    |> List.exists id
val testIntersect :
  listA:seq<'a> -> listB:'a list -> bool when 'a : equality
val listA : string list = ["A"; "B"; "C"]
val listB : string list = ["D"; "E"; "A"]
val it : bool = true
It's pretty inefficient to solve this problem using List, though; it's better to use Set. Building the two sets costs O(N log N + M log M), and the intersection itself needs only logarithmic-time lookups, instead of the O(N*M) comparisons of the nested list scan.
let testSetIntersect listA listB =
    Set.intersect (Set.ofList listA) (Set.ofList listB)
    |> Set.isEmpty
    |> not
One function that you could use is List.except, which is not yet documented (!) but can be seen in this pull request that was merged a couple of years ago. You'd probably use it like this:
let testIntersect a b =
    let b' = b |> List.except a
    // If b' is shorter than b, then b contained at least one element of a
    List.length b' < List.length b
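However, this runs through list B about three times: once for the except algorithm and once for each of the two length calls. (List.except also returns only the distinct elements, so the length comparison can report a false positive when b itself contains duplicates.) So another approach might be to do what you did, but turn list A into a set so that each membership check inside the exists call won't be O(N):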
let testIntersect a b =
    let setA = a |> Set.ofList
    match b |> List.tryFind (fun x -> setA |> Set.contains x) with
    | Some _ -> true
    | None -> false
The reason I used tryFind is because List.find would throw an exception if the predicate didn't match any items of the list.
Edit: An even better approach is to use List.exists, which I temporarily forgot about (thanks to Honza Brestan for reminding me about it):
let testIntersect a b =
    let setA = a |> Set.ofList
    b |> List.exists (fun x -> setA |> Set.contains x)
Which, of course, is pretty much what you were originally wanting to do in your testIntersect code sample. The only difference is that you were using the for ... in syntax in your code sample, which wouldn't work. In F#, the for loop is exclusively for expressions that return unit (and thus, probably have side effects). If you want to return a value, the for loop won't do that. So using the functions that do return value, like List.exists, is the approach you want to take.
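For reference, a small sketch that exercises the Set-backed List.exists version on an overlapping pair, a disjoint pair, and an empty list. The extra test inputs are made up for illustration.

```fsharp
let testIntersect a b =
    // Build the set once so each membership check is O(log N).
    let setA = Set.ofList a
    b |> List.exists (fun x -> Set.contains x setA)

let overlapping = testIntersect ["A"; "B"; "C"] ["D"; "E"; "A"]   // true
let disjoint = testIntersect ["A"; "B"; "C"] ["D"; "E"; "F"]      // false
let emptyLeft = testIntersect [] ["A"]                            // false

printfn "%b %b %b" overlapping disjoint emptyLeft
```

The disjoint case is worth keeping in any test, since an implementation can pass the example from the question by accident.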
let testIntersect listA listB =
    Set.intersect (Set.ofList listA) (Set.ofList listB) |> Set.isEmpty |> not

F# - Function like List.find but search for any of a Dictionary's keys

I want to create an F# function like List.find, but instead of searching for a single value, I want to search for any of the keys of a dictionary and return the corresponding dictionary value.
For example, this is a (poor) implementation of what I am trying to do.
let dict1=dict[(1,"A");(2,"B");(3,"C");(4,"D");(5,"E");(6,"F")]
let findInDict l =
    let mutable found=false
    let mutable value=""
    for elem in l do
        let f,v=dict1.TryGetValue(elem)
        value<-if f && not found then v else value
        found<-if not found then f else found
    value
findInDict [9;2;5]
>
val dict1 : System.Collections.Generic.IDictionary<int,string>
val findInDict : l:seq<int> -> string
val it : string = "B"
What would be a functional equivalent?
A function for this almost feels like overkill. You can do this in one line using a list comprehension:
[for x in [9;4;5] do if dict1.ContainsKey x then yield dict1.[x]]
Edit:
After re-reading your question, I realized the above was not quite what you are looking for.
let rec findAValue l =
    match l with
    | [] -> None
    | x::xs -> if dict1.ContainsKey x then Some(dict1.[x]) else findAValue xs
or more succinctly:
let rec findAValue = function
    | [] -> None
    | x::xs -> if dict1.ContainsKey x then Some(dict1.[x]) else findAValue xs
even more succinctly:
let findAValue = List.tryPick (fun x-> if dict1.ContainsKey x then Some(dict1.[x]) else None)
let highPerformanceFindAValue =
    List.tryPick (fun x ->
        match dict1.TryGetValue x with
        | true, value -> Some(value)
        | _ -> None)
In the case where no value is found the result is None otherwise it's Some(value).
let findFirst l (dict: System.Collections.Generic.Dictionary<int, string>) =
    let o = l |> List.tryFind (fun i -> dict.ContainsKey(i)) |> Option.map (fun k -> dict.[k])
    match o with | None -> "" | Some(k) -> k
There are tons of ways to do this.
The obvious solution is to iterate, like you did:
open System.Collections.Generic
let findInDict (d:IDictionary<'a, 'b>) l =
    seq {
        for key in l do
            let f, v = d.TryGetValue(key)
            if f then yield v
    }
which is OK, I guess. It more or less mimics the typical step-wise approach.
You could rewrite this in terms of some sequence operators:
let findInDict1 (d:IDictionary<'a, 'b>) l =
Seq.filter (fun elem -> d.ContainsKey(elem)) l |> Seq.map (fun elem -> d.Item(elem))
which feels more functional, but is clearly doing way more work than it should be.
let findInDict2 (d:IDictionary<'a, 'b>) l =
    Seq.choose(fun elem ->
        let f,v = d.TryGetValue(elem)
        if f then Some(v) else None) l
The last one makes the most sense in that we're only ever accessing the dictionary once per key and choose will do all the heavy lifting for us under the hood.
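The sequence-based versions above return every matching value, while the question asks only for the first one. A minimal sketch of that variant, assuming Seq.tryPick is acceptable and using a made-up helper name tryFindFirst:

```fsharp
open System.Collections.Generic

// Hypothetical helper: returns the value for the first key that exists in the dictionary.
let tryFindFirst (d: IDictionary<'k, 'v>) (keys: seq<'k>) : 'v option =
    keys
    |> Seq.tryPick (fun key ->
        // A single lookup per key; tryPick stops at the first Some.
        match d.TryGetValue key with
        | true, value -> Some value
        | _ -> None)

let sampleDict = dict [ (1, "A"); (2, "B"); (3, "C") ]
let found = tryFindFirst sampleDict [9; 2; 3]    // Some "B"
let missing = tryFindFirst sampleDict [7; 8]     // None
```

Returning an option leaves the "not found" decision to the caller instead of baking in an empty string.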

fold or choose till None?

Is there already a way to do something like a chooseTill or a foldTill, where it will process until a None option is received? Really, any of the higher order functions with a "till" option. Granted, it makes no sense for stuff like map, but I find I need this kind of thing pretty often and I wanted to make sure I wasn't reinventing the wheel.
In general, it'd be pretty easy to write something like this, but I'm curious if there is already a way to do this, or if this exists in some known library?
open System
let chooseTill predicate (sequence:seq<'a>) =
    seq {
        let finished = ref false
        for elem in sequence do
            if not !finished then
                match predicate elem with
                | Some(x) -> yield x
                | None -> finished := true
    }
let foldTill predicate seed list =
    let rec foldTill' acc = function
        | [] -> acc
        | (h::t) ->
            match predicate acc h with
            | Some(x) -> foldTill' x t
            | None -> acc
    foldTill' seed list
// Concatenates the string forms of both arguments
let (++) a b = string a + string b
let abcdef =
    foldTill
        (fun acc v ->
            if Char.IsWhiteSpace v then None
            else Some(acc ++ v))
        ""
        ("abcdef ghi" |> Seq.toList)
// result is "abcdef"
I think you can get that easily by combining Seq.scan and Seq.takeWhile:
open System
"abcdef ghi"
|> Seq.scan (fun (_, state) c -> c, state + (string c)) ('x', "")
|> Seq.takeWhile (fst >> Char.IsWhiteSpace >> not)
|> Seq.last |> snd
The idea is that Seq.scan is doing something like Seq.fold, but instead of waiting for the final result, it yields the intermediate states as it goes. You can then keep taking the intermediate states until you reach the end. In the above example, the state is the current character and the concatenated string (so that we can check if the character was whitespace).
A more general version based on a function that returns option could look like this:
let foldWhile f initial input =
  // Generate sequence of all intermediate states
  input |> Seq.scan (fun stateOpt inp ->
       // If the current state is not 'None', then calculate a new one
       // if 'f' returns 'None' then the overall result will be 'None'
       stateOpt |> Option.bind (fun state -> f state inp)) (Some initial)
  // Take only 'Some' states and get the last one
  |> Seq.takeWhile Option.isSome
  |> Seq.last |> Option.get
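A short usage sketch for foldWhile, repeated here so the block is self-contained. The folder functions appendUnlessSpace and addUnlessNegative are made up for illustration.

```fsharp
open System

let foldWhile f initial input =
    // Carry the state as an option; once it is None it stays None.
    let step stateOpt inp = stateOpt |> Option.bind (fun state -> f state inp)
    input
    |> Seq.scan step (Some initial)
    |> Seq.takeWhile Option.isSome
    |> Seq.last
    |> Option.get

// Stop at the first whitespace character.
let appendUnlessSpace (acc: string) (c: char) =
    if Char.IsWhiteSpace c then None else Some (acc + string c)

let firstWord = foldWhile appendUnlessSpace "" "abcdef ghi"   // "abcdef"

// Works on an infinite input as long as the folder eventually returns None.
let addUnlessNegative acc x = if x < 0 then None else Some (acc + x)

let sumUntilNegative =
    Seq.initInfinite (fun i -> if i < 5 then i else -1)   // 0, 1, 2, 3, 4, -1, -1, ...
    |> foldWhile addUnlessNegative 0                      // 10
```

Because the initial state is always Some, Seq.last and Option.get are safe here even when the folder returns None on the very first element.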

Split seq in F#

I need to split a seq<'a> into a seq<seq<'a>> by an attribute of the elements: wherever this attribute equals a given value, the sequence must be split at that point. How can I do that in F#?
It would be nice to pass it a function that returns a bool indicating whether the sequence must be split at that item.
Sample:
Input sequence: seq: {1,2,3,4,1,5,6,7,1,9}
It should be split at every item that equals 1, so the result should be:
seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}
All you're really doing is grouping--creating a new group each time a value is encountered.
let splitBy f input =
  let i = ref 0
  input
  |> Seq.map (fun x ->
    if f x then incr i
    !i, x)
  |> Seq.groupBy fst
  |> Seq.map (fun (_, b) -> Seq.map snd b)
Example
let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)
Again, shorter, with Stephen's nice improvements:
let splitBy f input =
  let i = ref 0
  input
  |> Seq.groupBy (fun x ->
    if f x then incr i
    !i)
  |> Seq.map snd
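Here is a sketch of the shorter version applied to the sample input from the question. It is written with .Value instead of ! and incr, which recent F# versions flag as deprecated; that substitution is mine, not part of the answer.

```fsharp
let splitBy f input =
    // Group key: number of split points seen so far.
    let i = ref 0
    input
    |> Seq.groupBy (fun x ->
        if f x then i.Value <- i.Value + 1
        i.Value)
    |> Seq.map snd

let parts =
    [1; 2; 3; 4; 1; 5; 6; 7; 1; 9]
    |> splitBy ((=) 1)
    |> Seq.map List.ofSeq
    |> List.ofSeq

printfn "%A" parts   // [[1; 2; 3; 4]; [1; 5; 6; 7]; [1; 9]]
```

One thing to keep in mind: Seq.groupBy reads the whole input before it yields the first group, so this approach is not suitable for infinite or very large sequences.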
Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:
let splitUsing special (input:seq<_>) = seq {
  use en = input.GetEnumerator()
  let finished = ref false
  let start = ref true
  let rec taking () = seq {
    if not (en.MoveNext()) then finished := true
    elif en.Current = special then start := true
    else
      yield en.Current
      yield! taking() }
  yield taking()
  while not (!finished) do
    yield Seq.concat [ Seq.singleton special; taking()] }
I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).
Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.
The following is an impure implementation but yields immutable sequences lazily:
let unflatten f s = seq {
    let buffer = ResizeArray()
    let flush() = seq {
        if buffer.Count > 0 then
            yield Seq.readonly (buffer.ToArray())
            buffer.Clear() }
    for item in s do
        if f item then yield! flush()
        buffer.Add(item)
    yield! flush() }
f is the function used to test whether an element should be a split point:
[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)
Probably not the most efficient solution, but this works:
let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s
let takeAndSkipUntil f = takeAndSkipWhile (f >> not)
let rec splitOn f s =
    if Seq.isEmpty s then
        Seq.empty
    else
        let pre, post =
            if f (Seq.head s) then
                takeAndSkipUntil f (Seq.skip 1 s)
                |> fun (a, b) ->
                    Seq.append [Seq.head s] a, b
            else
                takeAndSkipUntil f s
        if Seq.isEmpty pre then
            Seq.singleton post
        else
            Seq.append [pre] (splitOn f post)
splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>
The type of splitOn is ('a -> bool) -> seq<'a> -> seq<seq<'a>>. I haven't tested it on many inputs, but it seems to work.
In case you are looking for something that works like a string split (i.e. the item for which the predicate returns true is not included), below is what I came up with. I tried to be as functional as possible :)
open System.Collections.Generic
let fromEnum (input : 'a IEnumerator) =
    seq {
        while input.MoveNext() do
            yield input.Current
    }
let getMore (input : 'a IEnumerator) =
    if input.MoveNext() = false then None
    else Some ((input |> fromEnum) |> Seq.append [input.Current])
let splitBy (f : 'a -> bool) (input : 'a seq) =
    use s = input.GetEnumerator()
    let rec loop (acc : 'a seq seq) =
        match s |> getMore with
        | None -> acc
        | Some x ->
            [x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
            |> Seq.append acc
            |> loop
    loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)
seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"

F#: How do i split up a sequence into a sequence of sequences

Background:
I have a sequence of contiguous, time-stamped data. The data sequence has gaps in it where the data is not contiguous. I want to create a method to split the sequence up into a sequence of sequences so that each subsequence contains contiguous data (split the input sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/caching)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
    LazyList.delayed (fun () ->
        match input with
        | LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
        | LazyList.Cons(x,LazyList.Nil) ->
            LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
        | LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
            let groups = GroupBy comp xs
            if comp x y then
                LazyList.consf
                    (LazyList.consf x (fun () ->
                        let (LazyList.Cons(firstGroup,_)) = groups
                        firstGroup))
                    (fun () ->
                        let (LazyList.Cons(_,otherGroups)) = groups
                        otherGroups)
            else
                LazyList.cons (LazyList.cons x (LazyList.empty())) groups)
let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
You seem to want a function that has signature
('a -> bool) -> seq<'a> -> seq<seq<'a>>
I.e. a function and a sequence, then break up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest (albeit not exactly purist, it avoids iterating the input multiple times), though it will lose much of the laziness of the input:
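let groupBy (f: 'a -> bool) (input: seq<'a>) =
    seq {
        // Elements of the group currently being collected
        let cache = ref (new System.Collections.Generic.List<'a>())
        for e in input do
            (!cache).Add(e)
            // f returning false marks the end of the current group
            if not (f e) then
                yield !cache
                cache := new System.Collections.Generic.List<'a>()
        // Emit whatever is left over as the final group
        if (!cache).Count > 0 then
            yield !cache
    }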
An alternative implementation could pass the cache collection (as seq<'a>) to the function so it can see multiple elements to choose the break points.
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in Data.List:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs@(y : _))
  | comp x y = (x : firstGroup) : otherGroups
  | otherwise = [x] : groups
  where groups@(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
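For comparison, here is a rough F# translation of groupBy1 over ordinary F# lists. This is only a sketch: F# lists are eager, so it drops the laziness that the question requires, and it is not tail-recursive.

```fsharp
// Groups consecutive elements for which `comp previous next` holds.
let rec groupBy1 comp list =
    match list with
    | [] -> [ [] ]
    | [ x ] -> [ [ x ] ]
    | x :: (y :: _ as xs) ->
        match groupBy1 comp xs with
        // x belongs to the same group as y: prepend it to the first group.
        | firstGroup :: otherGroups when comp x y -> (x :: firstGroup) :: otherGroups
        // Otherwise x forms a group of its own.
        | groups -> [ x ] :: groups

let grouped = groupBy1 (fun x y -> y = x + 1) [1; 2; 3; 5; 6; 9]

printfn "%A" grouped   // [[1; 2; 3]; [5; 6]; [9]]
```

It mainly helps to see the recursion structure before attempting a lazy version on LazyList or IEnumerator.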
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperative nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
  let en = sq.GetEnumerator()
  let rec partitions (first:option<_>) =
    seq {
      match first with
      | Some first' ->
        (* The following value is always overwritten;
           it represents the first element of the next subsequence to output, if any *)
        let next = ref None
        (* This function generates a subsequence to output,
           setting next appropriately as it goes *)
        let rec iter item =
          seq {
            yield item
            if (en.MoveNext()) then
              let curr = en.Current
              if (cmp item curr) then
                yield! iter curr
              else // consumed one too many - pass it on as the start of the next sequence
                next := Some curr
            else
              next := None
          }
        yield iter first' (* generate the first sequence *)
        yield! partitions !next (* recursively generate all remaining sequences *)
      | None -> () // return an empty sequence if there are no more values
    }
  let first = if en.MoveNext() then Some en.Current else None
  partitions first
let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
  groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
  let en = sq.GetEnumerator()
  let next() = if en.MoveNext() then Some en.Current else None
  (* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
  let rec seqStartingWith start =
    match next() with
    | Some y when cmp start y ->
        let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
        seq { yield start; yield! fst (Lazy.force rest_next) },
        lazy Lazy.force (snd (Lazy.force rest_next))
    | next -> seq { yield start }, lazy next
  let rec iter start =
    seq {
      match (Lazy.force start) with
      | None -> ()
      | Some start ->
          let (first,next) = seqStartingWith start
          yield first
          yield! iter next
    }
  Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished
// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
    // simple unfold that assumes f captured a mutable variable
    let iter f = Seq.unfold (fun _ ->
                                match f() with
                                | Some(x) -> Some(x,())
                                | None -> None) ()
    seq {
        let state = ref (Unstarted)
        use ie = input.GetEnumerator()
        let innerMoveNext() =
            match !state with
            | Unstarted ->
                if ie.MoveNext()
                then let cur = ie.Current
                     state := InnerOkay(cur); Some(cur)
                else state := Finished; None
            | InnerOkay(last) ->
                if ie.MoveNext()
                then let cur = ie.Current
                     if f last cur
                     then state := InnerOkay(cur); Some(cur)
                     else state := NeedNewInner(cur); None
                else state := Finished; None
            | NeedNewInner(last) -> state := InnerOkay(last); Some(last)
            | Finished -> None
        let outerMoveNext() =
            match !state with
            | Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
            | InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
            | Finished -> None
        yield! iter outerMoveNext }
open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
    split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey
// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
    let baseTime = DateTime.Now
    seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
    Seq.zip contiguousTimeStamps numbers
    |> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData
printfn "Consuming..."
for group in grouped_data do
    printfn "about to do a group"
    for x in group do
        printfn " %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:seq<_>) = seq {
    let doneWithThisGroup = ref false
    let areMore = ref true
    use e = input.GetEnumerator()
    let Next() = areMore := e.MoveNext(); !areMore
    // deal with length 0 or 1, seed 'prev'
    if not(e.MoveNext()) then () else
    let prev = ref e.Current
    while !areMore do
        yield seq {
            while not(!doneWithThisGroup) do
                if Next() then
                    let next = e.Current
                    doneWithThisGroup := not(comp !prev next)
                    yield !prev
                    prev := next
                else
                    // end of list, yield final value
                    yield !prev
                    doneWithThisGroup := true }
        doneWithThisGroup := false }
let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
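If the "no caching" constraint can be relaxed for the inner groups, a much simpler sketch is possible: buffer one group at a time and yield it when a gap appears. The outer sequence stays lazy and the input is traversed once, but each group is materialized as a list, so this does not meet the question's strict requirement. The name groupContiguousBuffered and the test data are made up for illustration.

```fsharp
open System

// Hypothetical variant: lazy outer sequence, buffered inner groups.
let groupContiguousBuffered (maxGap: TimeSpan) (points: seq<DateTime * float>) : seq<(DateTime * float) list> =
    seq {
        let buffer = ResizeArray<DateTime * float>()
        for (time, value) in points do
            // A gap larger than maxGap closes the current group.
            if buffer.Count > 0 && time - fst buffer.[buffer.Count - 1] > maxGap then
                yield List.ofSeq buffer
                buffer.Clear()
            buffer.Add((time, value))
        // Emit the final group, if any.
        if buffer.Count > 0 then
            yield List.ofSeq buffer
    }

let baseTime = DateTime(2020, 1, 1)

// Minutes 1..10 with 4, 7 and 8 missing.
let sample =
    [ for n in 1 .. 10 do
        if n % 4 <> 0 && n % 7 <> 0 then
            yield (baseTime.AddMinutes(float n), float n) ]

let groupSizes =
    sample
    |> groupContiguousBuffered (TimeSpan(0, 1, 0))
    |> Seq.map List.length
    |> List.ofSeq

printfn "%A" groupSizes   // [3; 2; 2]
```

Because every group is a plain list, calling both Seq.length and Seq.head on a group (as the test code in the question does) is safe here, which is not the case for the fully lazy versions above.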

Resources