Find mean of list in F#

I'm trying to find the average of a list. Here's what I have so far:
let rec avg aList =
    match aList with
    | head::tail -> head+avg(tail)
    | [] -> 0
This obtains the sum. I've tried head+avg(tail)/aList.Length, but it gives me an incorrect result, and I don't understand exactly what that expression is doing.
Any help would be appreciated.

For an average, you'd want two things, the sum and the number of items. Using List.length would mean traversing the list again.
We can do those two things at the same time - by using a tuple.
This operation is known as folding (or sometimes aggregation). We apply the folding function, gathering our state as we traverse the list without mutating anything.
let avg aList =
    let rec accumulate (sum, count) list =
        match list with
        | head::tail -> accumulate (sum + head, count + 1) tail
        | [] -> (sum, count)
    let sum, count = accumulate (0, 0) aList
    // sum and count are both int here, so this is integer division
    let average = sum / count
    average
You can generalize this using fold.
let avg aList =
    let sum, count =
        List.fold (fun (sum, count) current -> (sum + current, count + 1)) (0,0) aList
    let average = sum / count
    average
The generic math version:
let inline avg (list: 'a list) =
    let rec accumulate (sum, count) list =
        match list with
        | head::tail -> accumulate (Checked.(+) sum head, count + 1) tail
        | [] -> (sum, count)
    let sum, count = accumulate (LanguagePrimitives.GenericZero<'a>, 0) list
    let average = LanguagePrimitives.DivideByInt sum count
    average

The simplest way of doing this is to use the average function from the List module. It works for element types such as float that support division by an integer. You can do this on a single line with
let myAverage = aList |> List.average
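If the list holds integers, List.average will not compile, because int does not support the division-by-integer operation it relies on. A minimal sketch of one way around that, assuming a float result is acceptable, is List.averageBy with a conversion function. The names ints, meanOfInts and meanOfFloats are only illustrative:

```fsharp
// Hypothetical sample data, not from the question.
let ints = [2; 4; 7]

// Convert each element to float before averaging.
let meanOfInts = ints |> List.averageBy float

// A float list can be averaged directly.
let meanOfFloats = [2.0; 4.0; 6.0] |> List.average

printfn "%f %f" meanOfInts meanOfFloats
```

Both functions throw an ArgumentException on an empty list, so guard for that case if the input may be empty.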

let avg aList =
    let rec sum = function
        | head :: tail -> head + (sum tail)
        | [] -> 0.
    sum aList / (aList |> List.length |> float)
let res = avg [ 2.; 4.; 6. ]
printfn "%A" res
I'm not sure this is the best way to do it, though.

Related

F#: Not understanding match .. with

I'm messing around with F# and Fable, and trying to test my understanding. To do so, I tried creating a function to calculate e given a certain number of iterations. What I've come up with is
let eCalc n =
    let rec internalECalc ifact sum count =
        match count = n with
        | true -> sum
        | _ -> internalECalc (ifact / (float count)) (sum + ifact) (count+1)
    internalECalc 1.0 0.0 1
Which works fine, returning 2.7182818284590455 when called with
eCalc 20
However, if I try what I think is the more correct form
let eCalc n =
    let rec internalECalc ifact sum count =
        match count with
        | n -> sum
        | _ -> internalECalc (ifact / (float count)) (sum + ifact) (count+1)
    internalECalc 1.0 0.0 1
I get the warning "[WARNING] This rule will never be matched (L5,10-L5,11)" and a returned value of 0 (the same thing happens if I swap 'n' and 'count' in the match statement). Is there a reason I can't use 'n' in the match statement? Is there a way around this so I can use 'n'?
Thanks
When you use a name in a match statement, you're not checking it against the value assigned to that variable the way you think you are. You are instead assigning that name. I.e.,
match someInt with
| n -> printfn "%d" n
will print the value of someInt. It's the equivalent of let n = someInt; printfn "%d" n.
What you wanted to do was use a when clause; inside a when clause, you're not pattern-matching, but doing a "standard" if check. So what you wanted was:
let eCalc n =
    let rec internalECalc ifact sum count =
        match count with
        | cnt when cnt = n -> sum
        | _ -> internalECalc (ifact / (float count)) (sum + ifact) (count+1)
    internalECalc 1.0 0.0 1
Does that make sense, or do you need me to go into more detail?
P.S. In a case like this one where your match function looks like "x when (boolean condition involving x) -> case 1 | _ -> case 2", it's quite a bit more readable to use a simple if expression:
let eCalc n =
    let rec internalECalc ifact sum count =
        if count = n then
            sum
        else
            internalECalc (ifact / (float count)) (sum + ifact) (count+1)
    internalECalc 1.0 0.0 1
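To see the binding pattern and the guard side by side, here is a small sketch. The function names describeBinding and describeGuard and the value n = 5 are made up for illustration:

```fsharp
let n = 5

// A bare name in a pattern introduces a new binding that shadows the outer n;
// it never compares against the outer value, so this rule always matches.
let describeBinding x =
    match x with
    | n -> sprintf "bound n = %d" n

// A guard performs an ordinary comparison against the outer n.
let describeGuard x =
    match x with
    | v when v = n -> "equal to n"
    | _ -> "different from n"

printfn "%s" (describeBinding 3) // bound n = 3
printfn "%s" (describeGuard 5)   // equal to n
printfn "%s" (describeGuard 3)   // different from n
```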

F# infinite stream of armstrong numbers

I'm trying to create an infinite stream in F# that contains Armstrong numbers. An Armstrong number is one where the cubes of its digits add up to the number itself. For example, 153 is an Armstrong number because 1^3 + 5^3 + 3^3 = 153. So far, I have created several functions to help me do so. They are:
type 'a stream = Cons of 'a * (unit -> 'a stream);;
let rec upfrom n = Cons (n, fun() -> upfrom (n+1));;
let rec toIntArray = function
    | 0 -> []
    | n -> n % 10 :: toIntArray (n / 10);;
let rec makearmstrong = function
    | [] -> 0
    | y::ys -> (y * y * y) + makearmstrong ys;;
let checkarmstrong n = n = makearmstrong(toIntArray n);;
let rec take n (Cons(x,xsf)) =
    match n with
    | 0 -> []
    | _ -> x :: take (n-1)(xsf());;
let rec filter p (Cons (x, xsf)) =
    if p x then Cons (x, fun() -> filter p (xsf()))
    else filter p (xsf());;
And finally:
let armstrongs = filter (fun n -> checkarmstrong n)(upfrom 1);;
Now, when I do take 4 armstrongs;; (or any number less than 4), this works perfectly and gives me [1;153;370;371], but if I do take 5 armstrongs;; nothing happens; the program seems to freeze.
I believe the issue is that there are no numbers after 407 that are the sums of their cubes (see http://oeis.org/A046197), but when your code evaluates the equivalent of take 1 (Cons(407, filter checkarmstrong (upfrom 408))) it's going to force the evaluation of the tail and filter will recurse forever, never finding a matching next element. Also note that your definition of Armstrong numbers differs from, say, Wikipedia's, which states that the power the digits are raised to should be the number of digits in the number.
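One way to avoid forcing the tail unnecessarily is to stop before calling xsf() once the last requested element has been produced. The following is only a sketch under that assumption: takeLazy and isCubeSum are hypothetical names, and the stream type is repeated so the block stands alone. Asking for a sixth element would still search for a long time, for the reason given above.

```fsharp
type 'a stream = Cons of 'a * (unit -> 'a stream)

let rec upfrom n = Cons (n, fun () -> upfrom (n + 1))

let rec filter p (Cons (x, xsf)) =
    if p x then Cons (x, fun () -> filter p (xsf ()))
    else filter p (xsf ())

// Sum of the cubes of the decimal digits of n (for n >= 0).
let rec cubeSum n =
    if n = 0 then 0
    else
        let d = n % 10
        d * d * d + cubeSum (n / 10)

let isCubeSum n = n = cubeSum n

// Variant of take that only forces the tail when more elements are needed.
let rec takeLazy n (Cons (x, xsf)) =
    if n <= 0 then []
    elif n = 1 then [x]
    else x :: takeLazy (n - 1) (xsf ())

let armstrongs = filter isCubeSum (upfrom 1)
printfn "%A" (takeLazy 5 armstrongs) // [1; 153; 370; 371; 407]
```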

Finding (and removing) repeating pairs in array

I want a way to get rid of repeating pairs in an array. For my problem, the pairs will be consecutive, and there will be at most one repeating pair.
My current implementation seems too complicated. The elements 3 and 4 form what I'm calling a repeating pair in arr1 below. As a pair, they only appear once in the desired output, arr2. What are some more efficient ways?
let arr1=[|4; 2; 3; 4; 3; 4; 1|]
let n=arr1.Length
let iPlus2IsEqual=Array.map2 (fun x y -> x=y) arr1.[2..] arr1.[..(n-3)]
let consecutive=Array.map2 (fun x y -> x && y) iPlus2IsEqual.[1..] iPlus2IsEqual.[..(n-4)] |> Array.tryFindIndex (fun x -> x)
let dup=if consecutive.IsSome then consecutive.Value+1 else n-1
let arr2=if dup>=n-3 then arr1.[..dup] else Array.append arr1.[..dup] arr1.[(dup+3)..]
>
val arr2 : int [] = [|4; 2; 3; 4; 1|]
We can use recursion like so (it will handle multiple repeats for free too):
let rec filterrepeats l =
    match l with
    |a::b::c::d::t when a=c && b=d -> a::b::(filterrepeats t)
    |h::t ->h::(filterrepeats t)
    |[] -> []
> filterrepeats [4;2;3;4;3;4;1];;
val it : int list = [4; 2; 3; 4; 1]
This works on lists, so you will need to add a call to Array.toList before you run it.
The above is not tail recursive as the compiler doesn't know what goes on the right hand side of h::(filterrepeats t) until after the function call. You can solve this by using an accumulator like so:
let rec filterrepeats l =
    let rec loop l acc =
        match l with
        |a::b::c::d::t when a=c && b=d ->loop t (b::a::acc)
        |h::t ->loop t (h::acc)
        |[] -> acc
    loop (List.rev l) []
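Since the question works with arrays, the list-based function can be wrapped with conversions in both directions. This is a minimal sketch: the wrapper name removeRepeatsInArray is an assumption, and the accumulator version is repeated (as filterRepeats) so the block runs on its own.

```fsharp
// Tail-recursive version from above: walks the reversed list and
// rebuilds the result in the original order via the accumulator.
let filterRepeats l =
    let rec loop l acc =
        match l with
        | a :: b :: c :: d :: t when a = c && b = d -> loop t (b :: a :: acc)
        | h :: t -> loop t (h :: acc)
        | [] -> acc
    loop (List.rev l) []

// Hypothetical wrapper: array in, array out.
let removeRepeatsInArray (arr: int[]) =
    arr |> Array.toList |> filterRepeats |> Array.ofList

printfn "%A" (removeRepeatsInArray [|4; 2; 3; 4; 3; 4; 1|]) // [|4; 2; 3; 4; 1|]
```

The two conversions each copy the data, so this favors simplicity over raw speed.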
For large arrays this is around 13x faster than your solution:
let inline tryFindDuplicatedPairIndex (xs: _ []) =
    let rec loop i x0 x1 x2 =
        // i+3 must be a valid index, so the last candidate is i = xs.Length-4
        if i <= xs.Length-4 then
            let x3 = xs.[i+3]
            if x0=x2 && x1=x3 then Some i else
            loop (i+1) x1 x2 x3
        else None
    if xs.Length < 4 then None else
    loop 0 xs.[0] xs.[1] xs.[2]
let inline removeDuplicatedPair (xs: _ []) =
    match tryFindDuplicatedPairIndex xs with
    | None -> Array.copy xs
    | Some i ->
        let ys = Array.zeroCreate (xs.Length-2)
        for j=0 to i-1 do
            ys.[j] <- xs.[j]
        for j=i+2 to xs.Length-1 do
            ys.[j-2] <- xs.[j]
        ys
I use inline and test elements individually (i.e. rather than as a tuple: (x0,x1) = (x2,x3)) to try to prevent = from being a generic equality test because that is very slow. I've reused previous array lookups from one iteration to the next. I copy the input array if the output is identical to the input and pre-allocate an array with n-2 elements otherwise. I've hand-rolled the copying to my pre-allocated array to avoid creating any garbage (e.g. instead of Array.append of two slices).
This version does not overflow the stack on large lists (length >= 100K) and removes all duplicate pairs:
let rec distinctPairs list =
    List.foldBack (fun x (l,r) -> x::r, l) list ([],[])
    |> fun (odds, evens) -> List.zip odds evens
    |> Seq.distinct
It is not very fast: a list of 1M elements takes about 500 ms. Is there a faster way?
It only works for lists of even length.

Take N elements from sequence with N different indexes in F#

I'm new to F# and looking for a function which take N*indexes and a sequence and gives me N elements. If I have N indexes it should be equal to concat Seq.nth index0, Seq.nth index1 .. Seq.nth indexN but it should only scan over indexN elements (O(N)) in the sequence and not index0+index1+...+indexN (O(N^2)).
To sum up, I'm looking for something like:
//For performance, the index-list should be ordered on input, be padding between elements instead of indexes or be ordered when entering the function
seq {10 .. 20} |> Seq.takeIndexes [0;5;10]
Result: 10,15,20
I could make this by using seq { yield... } and an index counter that ticks when an element should be passed out, but if F# offers a nice standard way I would rather use that.
Thanks :)
Addition: I have made the following. It works but isn't pretty. Suggestions are welcome.
let seqTakeIndexes (indexes : int list) (xs : seq<int>) =
    seq {
        //Assume indexes is sorted
        let e = xs.GetEnumerator()
        let i = ref indexes
        let curr = ref 0
        while e.MoveNext() && not (!i).IsEmpty do
            if !curr = List.head !i then
                i := (!i).Tail
                yield e.Current
            curr := !curr + 1
    }
When you want to access elements by index, using sequences isn't a good idea. Sequences are designed to allow sequential iteration. I would convert the necessary part of the sequence to an array and then pick the elements by index:
let takeIndexes ns input =
    // Take only elements that we need to access (sequence could be infinite)
    let arr = input |> Seq.take (1 + Seq.max ns) |> Array.ofSeq
    // Simply pick elements at the specified indices from the array
    seq { for index in ns -> arr.[index] }
seq [10 .. 20] |> takeIndexes [0;5;10]
Regarding your implementation - I don't think it can be made significantly more elegant. This is a general problem when implementing functions that need to take values from multiple sources in an interleaved fashion - there is just no elegant way of writing those!
However, you can write this in a functional way using recursion like this:
open System.Collections.Generic // needed for IEnumerator<_>
let takeIndexes indices (xs:seq<int>) =
    // Iterates over the list of indices recursively
    let rec loop (xe:IEnumerator<_>) idx indices = seq {
        let next = loop xe (idx + 1)
        // If the sequence ends, then end as well
        if xe.MoveNext() then
            match indices with
            | i::indices when idx = i ->
                // We're passing the specified index
                yield xe.Current
                yield! next indices
            | _ ->
                // Keep waiting for the first index from the list
                yield! next indices }
    seq {
        // Note: 'use' guarantees proper disposal of the source sequence
        use xe = xs.GetEnumerator()
        yield! loop xe 0 indices }
seq [10 .. 20] |> takeIndexes [0;5;10]
When you need to scan a sequence and accumulate results in O(n), you can always fall back to Seq.fold:
let takeIndices ind sq =
    let selector (idxLeft, currIdx, results) elem =
        match idxLeft with
        | [] -> (idxLeft, currIdx, results)
        | idx::moreIdx when idx = currIdx -> (moreIdx, currIdx+1, elem::results)
        | idx::_ when idx <> currIdx -> (idxLeft, currIdx+1, results)
        | idx::_ -> invalidOp "Can't get here."
    let (_, _, results) = sq |> Seq.fold selector (ind, 0, [])
    results |> List.rev
seq [10 .. 20] |> takeIndices [0;5;10]
The drawback of this solution is that it will enumerate the sequence to the end, even if it has accumulated all the desired elements already.
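A possible way to keep a single pass while stopping early is to cap the scan at the largest requested index. The sketch below is not the fold solution itself but a swapped-in alternative built from Seq.indexed, Seq.truncate and Seq.filter; the name takeIndicesLazy is hypothetical. Results come back in sequence order, so the index list does not need to be sorted.

```fsharp
let takeIndicesLazy (indices: int list) (xs: seq<'a>) : seq<'a> =
    match indices with
    | [] -> Seq.empty
    | _ ->
        let wanted = Set.ofList indices
        let last = List.max indices
        xs
        |> Seq.indexed                               // pair each element with its position
        |> Seq.truncate (last + 1)                   // never read past the largest index
        |> Seq.filter (fun (i, _) -> wanted.Contains i)
        |> Seq.map snd

printfn "%A" (seq { 10 .. 20 } |> takeIndicesLazy [0; 5; 10] |> List.ofSeq) // [10; 15; 20]
```

Because of the truncation, this also terminates on an infinite source such as Seq.initInfinite id.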
Here is my shot at this. This solution will only go as far as it needs into the sequence and returns the elements as a list.
let getIndices xs (s:_ seq) =
    let enum = s.GetEnumerator()
    let rec loop i acc = function
        | h::t as xs ->
            if enum.MoveNext() then
                if i = h then
                    loop (i+1) (enum.Current::acc) t
                else
                    loop (i+1) acc xs
            else
                raise (System.IndexOutOfRangeException())
        | _ -> List.rev acc
    loop 0 [] xs
[10..20]
|> getIndices [2;4;8]
// Returns [12;14;18]
The only assumption made here is that the index list you supply is sorted. The function won't work properly otherwise.
Is it a problem, that the returned result is sorted?
This algorithm will work linearly over the input sequence. Just the indices need to be sorted. If the sequence is large, but indices are not so many - it'll be fast.
Complexity is: N -> Max(indices), M -> count of indices: O(N + MlogM) in the worst case.
let seqTakeIndices indexes =
    let rec gather prev idxs xs =
        match idxs with
        | [] -> Seq.empty
        | n::ns -> seq { let left = xs |> Seq.skip (n - prev)
                         yield left |> Seq.head
                         yield! gather n ns left }
    indexes |> List.sort |> gather 0
Here is a List.fold variant, but it is more complex to read. I prefer the first:
let seqTakeIndices indices xs =
    let gather (prev, xs, res) n =
        let left = xs |> Seq.skip (n - prev)
        n, left, (Seq.head left)::res
    let _, _, res = indices |> List.sort |> List.fold gather (0, xs, [])
    res
Appended: This is still slower than your variant, but a lot faster than my older variants, because it does not use Seq.skip, which creates new enumerators and was slowing things down a lot.
let seqTakeIndices indices (xs : seq<_>) =
    let enum = xs.GetEnumerator()
    enum.MoveNext() |> ignore
    let rec gather prev idxs =
        match idxs with
        | [] -> Seq.empty
        | n::ns -> seq { if [1..n-prev] |> List.forall (fun _ -> enum.MoveNext()) then
                            yield enum.Current
                            yield! gather n ns }
    indices |> List.sort |> gather 0

F#: How do i split up a sequence into a sequence of sequences

Background:
I have a sequence of contiguous, time-stamped data. The data sequence has gaps in it where the data is not contiguous. I want to create a method to split the sequence up into a sequence of sequences so that each subsequence contains contiguous data (split the input sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/caching)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
    LazyList.delayed (fun () ->
        match input with
        | LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
        | LazyList.Cons(x,LazyList.Nil) ->
            LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
        | LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
            let groups = GroupBy comp xs
            if comp x y then
                LazyList.consf
                    (LazyList.consf x (fun () ->
                        let (LazyList.Cons(firstGroup,_)) = groups
                        firstGroup))
                    (fun () ->
                        let (LazyList.Cons(_,otherGroups)) = groups
                        otherGroups)
            else
                LazyList.cons (LazyList.cons x (LazyList.empty())) groups)
let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
You seem to want a function that has signature
('a -> bool) -> seq<'a> -> seq<seq<'a>>
I.e. a function and a sequence, then break up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest. It is not exactly purist, and it loses much of the laziness of the input, but it avoids iterating the input multiple times:
let groupBy (f: 'a -> bool) (input: seq<'a>) =
    seq {
        let cache = ref (new System.Collections.Generic.List<'a>())
        for e in input do
            (!cache).Add(e)
            // f returning false marks the end of the current group
            if not (f e) then
                yield !cache
                cache := new System.Collections.Generic.List<'a>()
        if (!cache).Count > 0 then
            yield !cache
    }
An alternative implementation could pass the cache collection (as seq<'a>) to the function so that it can see multiple elements when choosing the break points.
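A variation on the caching idea that compares each element with its predecessor, which is closer to the gap-splitting in the question, might look like the sketch below. The name splitWhen and its signature are assumptions. Each group is buffered into a list before it is yielded, so the outer sequence is lazy but the inner groups are not, which means it does not meet the question's no-caching constraint.

```fsharp
let splitWhen (keepTogether: 'a -> 'a -> bool) (input: seq<'a>) : seq<'a list> =
    seq {
        use e = input.GetEnumerator()
        if e.MoveNext() then
            // Buffer for the group currently being collected.
            let buffer = ResizeArray<'a>()
            buffer.Add(e.Current)
            while e.MoveNext() do
                let prev = buffer.[buffer.Count - 1]
                let curr = e.Current
                // A failed comparison closes the current group.
                if not (keepTogether prev curr) then
                    yield List.ofSeq buffer
                    buffer.Clear()
                buffer.Add(curr)
            // Emit whatever is left once the input ends.
            yield List.ofSeq buffer
    }

let groups = splitWhen (fun x y -> y = x + 1) [1; 2; 3; 5; 6; 8] |> List.ofSeq
printfn "%A" groups // [[1; 2; 3]; [5; 6]; [8]]
```

For the time-stamped data in the question, the comparison would be something like fun (t1, _) (t2, _) -> t2 - t1 <= maxGap, where maxGap is a TimeSpan.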
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in Data.List:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs@(y : _))
  | comp x y = (x : firstGroup) : otherGroups
  | otherwise = [x] : groups
  where groups@(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperative nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let rec partitions (first:option<_>) =
        seq {
            match first with
            | Some first' -> //'
                (* The following value is always overwritten;
                   it represents the first element of the next subsequence to output, if any *)
                let next = ref None
                (* This function generates a subsequence to output,
                   setting next appropriately as it goes *)
                let rec iter item =
                    seq {
                        yield item
                        if (en.MoveNext()) then
                            let curr = en.Current
                            if (cmp item curr) then
                                yield! iter curr
                            else // consumed one too many - pass it on as the start of the next sequence
                                next := Some curr
                        else
                            next := None
                    }
                yield iter first' (* ' generate the first sequence *)
                yield! partitions !next (* recursively generate all remaining sequences *)
            | None -> () // return an empty sequence if there are no more values
        }
    let first = if en.MoveNext() then Some en.Current else None
    partitions first
let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
    groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let next() = if en.MoveNext() then Some en.Current else None
    (* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
    let rec seqStartingWith start =
        match next() with
        | Some y when cmp start y ->
            let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
            seq { yield start; yield! fst (Lazy.force rest_next) },
            lazy Lazy.force (snd (Lazy.force rest_next))
        | next -> seq { yield start }, lazy next
    let rec iter start =
        seq {
            match (Lazy.force start) with
            | None -> ()
            | Some start ->
                let (first,next) = seqStartingWith start
                yield first
                yield! iter next
        }
    Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished
// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
    // simple unfold that assumes f captured a mutable variable
    let iter f = Seq.unfold (fun _ ->
                    match f() with
                    | Some(x) -> Some(x,())
                    | None -> None) ()
    seq {
        let state = ref (Unstarted)
        use ie = input.GetEnumerator()
        let innerMoveNext() =
            match !state with
            | Unstarted ->
                if ie.MoveNext()
                then let cur = ie.Current
                     state := InnerOkay(cur); Some(cur)
                else state := Finished; None
            | InnerOkay(last) ->
                if ie.MoveNext()
                then let cur = ie.Current
                     if f last cur
                     then state := InnerOkay(cur); Some(cur)
                     else state := NeedNewInner(cur); None
                else state := Finished; None
            | NeedNewInner(last) -> state := InnerOkay(last); Some(last)
            | Finished -> None
        let outerMoveNext() =
            match !state with
            | Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
            | InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
            | Finished -> None
        yield! iter outerMoveNext }
open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
    split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey
// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
    let baseTime = DateTime.Now
    seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
    Seq.zip contiguousTimeStamps numbers
    |> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData
printfn "Consuming..."
for group in grouped_data do
    printfn "about to do a group"
    for x in group do
        printfn " %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:seq<_>) = seq {
    let doneWithThisGroup = ref false
    let areMore = ref true
    use e = input.GetEnumerator()
    let Next() = areMore := e.MoveNext(); !areMore
    // deal with length 0 or 1, seed 'prev'
    if not(e.MoveNext()) then () else
    let prev = ref e.Current
    while !areMore do
        yield seq {
            while not(!doneWithThisGroup) do
                if Next() then
                    let next = e.Current
                    doneWithThisGroup := not(comp !prev next)
                    yield !prev
                    prev := next
                else
                    // end of list, yield final value
                    yield !prev
                    doneWithThisGroup := true }
        doneWithThisGroup := false }
let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
