How to improve this F# function - f#

I am experienced in C# but new to F# and Functional Programming. Now I am trying to implement a class library in F#. Here is one of the functions:
It takes a list of integers <= 9 and changes runs of consecutive 9s, such as 9, 9, 9, 9, to 9, 10, 11, 12. For example, [9;9;9;1;4;0;1;9;9;9;9] will be changed to [9; 10; 11; 1; 4; 0; 1; 9; 10; 11; 12].
The C# function is trivial:
void ReleaseCap(List<int> items)
{
    for (int i = 1; i < items.Count; i++)
    {
        var current = items[i];
        var previous = items[i - 1];
        //If current value = 9 and previous >= 9, then current value should be previous+1
        if (current == 9 && previous >= 9)
        {
            items[i] = previous + 1;
        }
    }
}
Now here is my F# tail-recursive version. Instead of looping over the list by index, it recursively moves items from an initial list to a processed list until the initial list is empty:
let releaseCap items =
    let rec loop processed remaining = //tail recursion
        match remaining with
        | [] -> processed //if nothing left, the job is done.
        | current :: rest when current = 9 -> //if current item =9
            match processed with
            // previous value >= 9, then append previous+1 to the processed list
            | previous :: _ when previous >= 9 -> loop (previous+1 :: processed) rest
            //if previous < 9, the current one should be just 9
            | _ -> loop (current :: processed) rest
        //otherwise, just put the current value to the processed list
        | current :: rest -> loop (current :: processed) rest
    loop [] items |> List.rev
While the C# version is trivial and intuitive, the F# code is verbose and less intuitive. Can any part of the F# code be improved to make it more elegant?

You can reuse existing functions in order to simplify your code.
Normally when you change items in a list you think of a map, but in this case you have something to remember from your previous computation which should be passed along for each item, so you should look at the fold-related functions.
Here's one: List.scan
let releaseCap items =
    items
    |> List.scan (fun previous current ->
        if current = 9 && previous >= 9 then previous + 1
        else current) 0
    |> List.tail
FP is not just about using recursion instead of loops. Recursion is typically used in basic and reusable functions, then by combining those functions you can solve complex problems.
NOTE: You are comparing your C# solution with your F# solution, but did you notice that apart from the language there is an important difference between both solutions? Your C# solution uses mutability.
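To make the scan-based answer above easier to follow, here is a small trace of what List.scan produces. It is only a sketch: the names step, input, states and result are illustrative and not part of the original answer. It also shows the point about mutability, since the input list is left untouched.

```fsharp
// Same lambda as in the answer above, given a name so it can be reused.
let step previous current =
    if current = 9 && previous >= 9 then previous + 1
    else current

let input = [9; 9; 9; 1; 4; 0; 1; 9; 9; 9; 9]

// List.scan returns the seed followed by every intermediate state.
let states = List.scan step 0 input
// states = [0; 9; 10; 11; 1; 4; 0; 1; 9; 10; 11; 12]

// Dropping the seed gives the result; `input` itself is unchanged.
let result = List.tail states
// result = [9; 10; 11; 1; 4; 0; 1; 9; 10; 11; 12]
```

The seed 0 only has to be smaller than 9, so that a 9 in the first position is left as it is.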

In your C# code you correctly assume that the 1st element of the list will never change and start with the 2nd element.
If you include this in the F# code you can skip the match on accumulator.
The result will be a bit simpler.
let reduceCap l =
    let rec reduceCapRec acc l =
        let previous = List.head acc
        match l with
        | x::xs ->
            if x = 9 && previous >= 9
            then reduceCapRec ((previous+1) :: acc) xs
            else reduceCapRec (x :: acc) xs
        | [] -> acc
    reduceCapRec [List.head l] (List.tail l)
    |> List.rev
(Although Gustavo's solution is still much better - I'm also new to FP)

I thought fold was the important one! :-)
So:
let list = [9;9;9;1;4;0;1;9;9;9;9]
let incr acc e =
    match acc, e with
    | h::t, e when e = 9 && h >= 9 -> (h + 1)::acc
    | _ -> e::acc
list
|> List.fold incr []
|> List.rev
//val it : int list = [9; 10; 11; 1; 4; 0; 1; 9; 10; 11; 12]

Related

F# sorting issue

What's wrong with this code? Why won't it sort?
let rec sort = function
    | [] -> []
    | [x] -> [x]
    | x1::x2::xs -> if x1 <= x2 then x1 :: sort (x2::xs)
                    else x2 :: sort (x1::xs)
It is supposed to take
sort [3;1;4;1;5;9;2;6;5];;
and return:
val it : int list = [1; 1; 2; 3; 4; 5; 5; 6; 9]
Your code performs a single pass of bubble sort. It moves the maximum element to the last position and shifts some of the other larger elements to the right.
Note that you traverse the original list only once. That has linear complexity, and a comparison-based sort needs on the order of n*log n comparisons in the worst case.
You can repeat that pass more times if you want to sort the list.
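Repeating the pass can be sketched as follows. This is only an illustration of the suggestion above, not the answerer's code: bubblePass is the question's sort function under a new (hypothetical) name, and bubbleSort simply reapplies it until the list stops changing.

```fsharp
// One pass, same logic as the `sort` function in the question.
let rec bubblePass = function
    | [] -> []
    | [x] -> [x]
    | x1 :: x2 :: xs ->
        if x1 <= x2 then x1 :: bubblePass (x2 :: xs)
        else x2 :: bubblePass (x1 :: xs)

// Repeat the pass until a pass leaves the list unchanged,
// which only happens when every adjacent pair is already in order.
let rec bubbleSort xs =
    let next = bubblePass xs
    if next = xs then xs else bubbleSort next

let sorted = bubbleSort [3; 1; 4; 1; 5; 9; 2; 6; 5]
// sorted = [1; 1; 2; 3; 4; 5; 5; 6; 9]
```

This is quadratic in the worst case, so it is meant to show why one pass is not enough rather than to be a practical sort; List.sort is the practical choice.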
Think about what can end up in the first position: right now it can only be the first or the second element of your list (the if inside the last case). A selection sort picks the minimum of the whole list instead:
let rec ssort = function
    | [] -> []
    | x::xs ->
        // find the minimum, collecting all other elements in acc
        let min, rest =
            List.fold (fun (min,acc) h ->
                if h<min then (h, min::acc)
                else (min, h::acc))
                (x, []) xs
        min::ssort rest

Converting a loop to pure functions

I have this code written for a Project Euler problem in C++:
int sum = 0;
for(int i =0; i < 1000; i++)
{
    //Check if multiple of 3 but not multiple of 5 to prevent duplicate
    sum += i % 3 == 0 && i % 5 != 0 ? i: 0;
    //check for all multiple of 5, including those of 3
    sum += i % 5 == 0 ? i: 0;
}
cout << sum;
I'm trying to learn F# and rewrite this in F#. This is what I have so far:
open System
//function to calculate the multiples
let multiple3v5 num =
    num
//function to calculate sum of list items
let rec SumList xs =
    match xs with
    | [] -> 0
    | y::ys -> y + SumList ys
let sum = Array.map multiple3v5 [|1 .. 1000|]
What I have may be complete nonsense...so help please?
Your sumList function is a good start. It already iterates (recursively) over the entire list, so you don't need to wrap it in an additional Array.map. You just need to extend your sumList so that it adds the number only when it matches the specified condition.
Here is a solution to a simplified problem - add all numbers that are divisible by 3:
open System
let rec sumList xs =
    match xs with
    | [] -> 0 // If the list is empty, the sum is zero
    | y::ys when y % 3 = 0 ->
        // If the list starts with y that is divisible by 3, then we add 'y' to the
        // sum that we get by recursively processing the rest of the list
        y + sumList ys
    | y::ys ->
        // This will only execute when y is not divisible by 3, so we just
        // recursively process the rest of the list and return
        // that (without adding current value)
        sumList ys
// For testing, let's sum all numbers divisible by 3 between 1 and 10.
let sum = sumList [ 1 .. 10 ]
This is the basic way of writing the function using explicit recursion. In practice, the solution by jpalmer is how I'd solve it too, but it is useful to write a few recursive functions yourself if you're learning F#.
The accumulator parameter mentioned by sashang is a more advanced way to write this. You'll need to do that if you want to run the function on large inputs (which is likely the case in Euler problem). When using accumulator parameter, the function can be written using tail recursion, so it avoids stack overflow even when processing long lists.
The idea of an accumulator-based version is that the function takes an additional parameter, which represents the sum calculated so far.
let rec sumList xs sumSoFar = ...
When you call it initially, you write sumList [ ... ] 0. The recursive calls will not compute y + sumList ys, but will instead add y to the accumulator and then make the recursive call sumList ys (y + sumSoFar). This way, the F# compiler can do tail-call optimization and translate the code to a loop (similar to the C++ version).
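Filled in for the simplified "divisible by 3" problem used above, the accumulator version might look like the sketch below. The body is an assumption based on the description; the original answer only gives the signature.

```fsharp
// Tail-recursive sum of the elements divisible by 3.
// sumSoFar carries the running total, so nothing is left to do after the recursive call.
let rec sumList xs sumSoFar =
    match xs with
    | [] -> sumSoFar                                        // end of list: the accumulator is the answer
    | y :: ys when y % 3 = 0 -> sumList ys (y + sumSoFar)   // count y
    | _ :: ys -> sumList ys sumSoFar                        // skip y

let sum = sumList [ 1 .. 10 ] 0
// sum = 18 (3 + 6 + 9)
```

Because every recursive call is the last thing the function does, the call stack does not grow with the length of the list.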
I'm not sure that translating a solution from an imperative language is a good way to develop a functional mindset: the instrument (C++ in your case) has already dictated an imperative approach, so it's better to stick to the original problem statement.
Overall, tasks from Project Euler are excellent for mastering many F# facilities. For example, you may use list comprehensions, as in the snippet below:
// multipleOf3Or5 function definition is left for your exercise
let sumOfMultiples n =
    [ for x in 1 .. n do if multipleOf3Or5 x then yield x] |> List.sum
sumOfMultiples 999
or you can generalize the solution suggested by @jpalmer a bit by exploiting laziness:
Seq.initInfinite id
|> Seq.filter multipleOf3Or5
|> Seq.takeWhile ((>) 1000)
|> Seq.sum
or you may even use this opportunity to master active patterns:
let (|DivisibleBy|_|) divisor num = if num % divisor = 0 then Some(num) else None
{1..999}
|> Seq.map (fun i ->
    match i with | DivisibleBy 3 i -> i | DivisibleBy 5 i -> i | _ -> 0)
|> Seq.sum
All three variations above implement a common pattern of making a sequence of members with sought property and then folding it by calculating sum.
F# has many more functions than just map - this problem suggests using filter and sum, my approach would be something like
let valid n = failwith "Left as an exercise"
let r =
    [1..999]
    |> List.filter valid
    |> List.sum
printfn "%i" r
I didn't want to do the whole problem, but filling in the missing function shouldn't be too hard.
This is how you turn a loop with a counter into a recursive function. You do this by passing an accumulator parameter to the loop function that holds the current loop count.
For example:
let rec loop acc =
    if acc = 10 then
        printfn "endloop"
    else
        printfn "%d" acc
        loop (acc + 1)
loop 0
This will stop when acc is 10.

In F#, is there a functional way to convert a flat array of items into an array of groups of items?

In F#, imagine we have an array of bytes representing pixel data with three bytes per pixel in RGB order:
[| 255; 0; 0; //Solid red
   0; 255; 0; //Solid green
   0; 0; 255; //Solid blue
   1; 72; 9;
   34; 15; 155
   ... |]
I'm having a hard time knowing how to functionally operate on this data as-is, since a single item is really a consecutive block of three elements in the array.
So, I need to first group the triples in the array into something like this:
[|
   [| 255; 0; 0 |];
   [| 0; 255; 0 |];
   [| 0; 0; 255 |];
   [| 1; 72; 9 |];
   [| 34; 15; 155 |]
   ... |]
Now, gathering up the triples into sub-arrays is easy enough to do with a for loop, but I'm curious--is there a functional way to gather up groups of array elements in F#? My ultimate goal is not simply to convert the data as illustrated above, but to solve the problem in a more declarative and functional manner. But I have yet to find an example of how to do this without an imperative loop.
kvb's answer may not give you what you want. Seq.windowed returns a sliding window of values, e.g., [1; 2; 3; 4] becomes [[1; 2; 3]; [2; 3; 4]]. It seems like you want it split into contiguous chunks. The following function takes a list and returns a list of triples ('T list -> ('T * 'T * 'T) list).
let toTriples list =
    let rec aux f = function
        | a :: b :: c :: rest -> aux (fun acc -> f ((a, b, c) :: acc)) rest
        | _ -> f []
    aux id list
Here's the inverse:
let ofTriples triples =
    let rec aux f = function
        | (a, b, c) :: rest -> aux (fun acc -> f (a :: b :: c :: acc)) rest
        | [] -> f []
    aux id triples
EDIT
If you're dealing with huge amounts of data, here's a sequence-based approach with constant memory use (all the options and tuples it creates have a negative impact on GC--see below for a better version):
open System.Collections.Generic

let (|Next|_|) (e:IEnumerator<_>) =
    if e.MoveNext() then Some e.Current
    else None
let (|Triple|_|) = function
    | Next a & Next b & Next c -> Some (a, b, c) //change to [|a;b;c|] if you like
    | _ -> None
let toSeqTriples (items:seq<_>) =
    use e = items.GetEnumerator()
    let rec loop() =
        seq {
            match e with
            | Triple (a, b, c) ->
                yield a, b, c
                yield! loop()
            | _ -> ()
        }
    loop()
EDIT 2
ebb's question about memory use prompted me to test and I found toSeqTriples to be slow and cause surprisingly frequent GCs. The following version fixes those issues and is almost 4x faster than the list-based version.
let toSeqTriplesFast (items:seq<_>) =
    use e = items.GetEnumerator()
    let rec loop() =
        seq {
            if e.MoveNext() then
                let a = e.Current
                if e.MoveNext() then
                    let b = e.Current
                    if e.MoveNext() then
                        let c = e.Current
                        yield (a, b, c)
                        yield! loop()
        }
    loop()
This has relatively constant memory usage vs a list or array-based approach because a) if you have a seq to start with the entire sequence doesn't have to be slurped into a list/array; and b) it also returns a sequence, making it lazy, and avoiding allocating yet another list/array.
"I need to first group the triples in the array into something like this:"
If you know they will always be triples, then representing them as a tuple int * int * int is more "typeful" than using an array because it conveys the fact that there are only ever exactly three elements.
Other people have described various ways to massage the data but I would actually recommend not bothering (unless there is more to this than you have described). I would opt for a function to destructure your array as-is instead:
let get i = a.[3*i], a.[3*i+1], a.[3*i+2]
If you really want to change the representation then you can now do:
let b = Array.init (a.Length/3) get
The answer really depends upon what you want to do next though...
(Hat tip: Scott Wlaschin) As of F# 4.0, you can use Array.chunkBySize(). It does exactly what you want:
let bs = [| 255; 0; 0; //Solid red
            0; 255; 0; //Solid green
            0; 0; 255; //Solid blue
            1; 72; 9;
            34; 15; 155 |]
let grouped = bs |> Array.chunkBySize 3
// [| [|255; 0; 0|]
//    [| 0; 255; 0|]
//    [| 0; 0; 255|]
//    [| 1; 72; 9|]
//    [| 34; 15; 155|] |]
The List and Seq modules also have chunkBySize() in F# 4.0. As of this writing, the docs at MSDN don't show chunkBySize() anywhere, but it's there if you're using F# 4.0.
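Array.chunkBySize also returns a shorter final chunk when the input length is not a multiple of the chunk size. The sketch below is an illustration only, with a hypothetical rawBytes input that is one value short; it combines chunkBySize with the tuple representation suggested in another answer and drops an incomplete trailing pixel.

```fsharp
// 11 values: three complete pixels and one incomplete pixel at the end.
let rawBytes = [| 255; 0; 0;  0; 255; 0;  0; 0; 255;  1; 72 |]

let pixels =
    rawBytes
    |> Array.chunkBySize 3
    |> Array.choose (function
        | [| r; g; b |] -> Some (r, g, b)   // a complete pixel
        | _ -> None)                        // a trailing chunk shorter than 3 is dropped
// pixels = [| (255, 0, 0); (0, 255, 0); (0, 0, 255) |]
```

Whether an incomplete trailing chunk should be dropped or treated as an error depends on the data; raising an exception in the second case is an equally reasonable choice.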
UPDATE: As pointed out by Daniel, this answer is wrong because it creates a sliding window.
You can use the Seq.windowed function from the library. E.g.
let rgbPix = rawValues |> Seq.windowed 3
This returns a sequence rather than an array, so if you need random access, you could follow that with a call to Seq.toArray.
Another approach, that takes and yields arrays directly:
let splitArrays n arr =
    match Array.length arr with
    | 0 ->
        invalidArg "arr" "array is empty"
    | x when x % n <> 0 ->
        invalidArg "arr" "array length is not evenly divisible by n"
    | arrLen ->
        let ret = arrLen / n |> Array.zeroCreate
        let rec loop idx =
            ret.[idx] <- Array.sub arr (idx * n) n
            match idx + 1 with
            | idx' when idx' <> ret.Length -> loop idx'
            | _ -> ret
        loop 0
Or, yet another:
let splitArray n arr =
    match Array.length arr with
    | 0 ->
        invalidArg "arr" "array is empty"
    | x when x % n <> 0 ->
        invalidArg "arr" "array length is not evenly divisible by n"
    | arrLen ->
        let rec loop idx = seq {
            yield Array.sub arr idx n
            let idx' = idx + n
            if idx' <> arrLen then
                yield! loop idx' }
        loop 0 |> Seq.toArray

Remove a single non-unique value from a sequence in F#

I have a sequence of integers representing dice in F#.
In the game in question, the player has a pool of dice and can choose to play one (governed by certain rules) and keep the rest.
If, for example, a player rolls a 6, 6 and a 4 and decides to play one of the sixes, is there a simple way to return a sequence with only one 6 removed?
Seq.filter (fun x -> x <> 6) dice
removes all of the sixes, not just one.
Non-trivial operations on sequences are painful to work with, since they don't support pattern matching. I think the simplest solution is as follows:
let filterFirst f s =
    seq {
        let filtered = ref false
        for a in s do
            if filtered.Value = false && f a then
                filtered := true
            else yield a
    }
So long as the mutable implementation is hidden from the client, it's still functional style ;)
If you're going to store data, I would use ResizeArray instead of a sequence. It has a wealth of functions built in, such as the one you asked about. It's simply called Remove. Note: ResizeArray is an abbreviation for the CLI type System.Collections.Generic.List<T>.
let test = seq [1; 2; 6; 6; 1; 0]
let a = new ResizeArray<int>(test)
a.Remove 6 |> ignore
Seq.toList a |> printf "%A"
// output
> [1; 2; 6; 1; 0]
Other data type options could be Array
let removeOneFromArray v a =
    let i = Array.findIndex ((=)v) a
    Array.append a.[..(i-1)] a.[(i+1)..]
or List
let removeOneFromList v l =
    let rec remove acc = function
        | x::xs when x = v -> List.rev acc @ xs
        | x::xs -> remove (x::acc) xs
        | [] -> List.rev acc // value not found: return the list in its original order
    remove [] l
The code below will work for a list (so not for any seq, but it sounds like the sequence you're using could be a List):
let rec removeOne value list =
    match list with
    | head::tail when head = value -> tail
    | head::tail -> head::(removeOne value tail)
    | _ -> [] //you might wanna fail here since it didn't find value in
              //the list
EDIT: code updated based on correct comment below. Thanks P
EDIT: After reading a different answer I thought a warning would be in order. Don't use the above code for infinite sequences. Since I guess your players don't have infinite dice, that should not be a problem, but for completeness here's an implementation that would work for (almost) any finite sequence (call it with an empty accumulator, e.g. removeOne 6 dice []):
let rec removeOne value (s: seq<_>) acc =
    if Seq.isEmpty s then
        Seq.ofList (List.rev acc) //you might wanna fail here since it didn't find value in the sequence
    elif Seq.head s = value then
        Seq.append (List.rev acc) (Seq.skip 1 s)
    else
        removeOne value (Seq.skip 1 s) (Seq.head s :: acc)
However, I recommend using the first solution, which I'm confident will perform better than the latter even if you have to turn the sequence into a list first (at least for small sequences, or large sequences with the sought-for value near the end).
I don't think there is any function that would allow you to directly represent the idea that you want to remove just the first element matching the specified criteria from the list (e.g. something like Seq.removeOne).
You can implement the function in a relatively readable way using Seq.fold (if the sequence of numbers is finite):
let removeOne f l =
    Seq.fold (fun (removed, res) v ->
        if removed then true, v::res
        elif f v then true, res
        else false, v::res) (false, []) l
    |> snd |> List.rev
> removeOne (fun x -> x = 6) [ 1; 2; 6; 6; 1 ];;
val it : int list = [1; 2; 6; 1]
The fold function keeps some state - in this case of type bool * list<'a>. The Boolean flag represents whether we already removed some element and the list is used to accumulate the result (which has to be reversed at the end of processing).
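For plain lists, a shorter variant is possible on newer compilers. The sketch below assumes F# 6.0 or later, where List.removeAt is available in FSharp.Core; removeFirst is a hypothetical name and not part of any answer here.

```fsharp
// Assumption: F# 6.0 or later (List.removeAt was added in that release).
let removeFirst predicate items =
    match List.tryFindIndex predicate items with
    | Some i -> List.removeAt i items   // remove only the first match
    | None -> items                     // nothing matched: return the list unchanged

let remainingDice = removeFirst (fun x -> x = 6) [6; 6; 4]
// remainingDice = [6; 4]
```

Like the fold version, this only works for finite input, since it needs a list.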
If you need to do this for (possibly) infinite seq<int>, then you'll need to use GetEnumerator directly and implement the code as a recursive sequence expression. This is a bit uglier and it would look like this:
let removeOne f (s:seq<_>) =
    // Get enumerator of the input sequence
    let en = s.GetEnumerator()
    let rec loop() = seq {
        // Move to the next element
        if en.MoveNext() then
            // Is this the element to skip?
            if f en.Current then
                // Yes - return all remaining elements without filtering
                while en.MoveNext() do
                    yield en.Current
            else
                // No - return this element and continue looping
                yield en.Current
                yield! loop() }
    loop()
You can try this:
let rec removeFirstOccurrence item screened items =
    items |> function
    | h::tail -> if h = item
                 then screened @ tail
                 else tail |> removeFirstOccurrence item (screened @ [h])
    | _ -> screened // item not found: everything seen so far is the whole list
Usage:
let updated = products |> removeFirstOccurrence product []

F#: How do I split up a sequence into a sequence of sequences

Background:
I have a sequence of contiguous, time-stamped data. The data sequence has gaps in it where the data is not contiguous. I want to create a method to split the sequence up into a sequence of sequences so that each subsequence contains contiguous data (split the input sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/caching)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
    LazyList.delayed (fun () ->
        match input with
        | LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
        | LazyList.Cons(x,LazyList.Nil) ->
            LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
        | LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
            let groups = GroupBy comp xs
            if comp x y then
                LazyList.consf
                    (LazyList.consf x (fun () ->
                        let (LazyList.Cons(firstGroup,_)) = groups
                        firstGroup))
                    (fun () ->
                        let (LazyList.Cons(_,otherGroups)) = groups
                        otherGroups)
            else
                LazyList.cons (LazyList.cons x (LazyList.empty())) groups)
let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn "   %d" x
You seem to want a function that has signature
('a -> bool) -> seq<'a> -> seq<seq<'a>>
I.e. a function and a sequence, then break up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest (albeit not exactly purist), and it avoids iterating the input multiple times. It will lose much of the laziness of the input, though:
let groupBy (f: 'a -> bool) (input: seq<'a>) =
    seq {
        let cache = ref (new System.Collections.Generic.List<'a>())
        for e in input do
            (!cache).Add(e)
            if not (f e) then
                yield !cache
                cache := new System.Collections.Generic.List<'a>()
        if (!cache).Count > 0 then
            yield !cache
    }
An alternative implementation could pass the cache collection (as seq<'a>) to the function so it can see multiple elements to choose the break points.
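As a point of comparison, here is what grouping looks like when the predicate sees two neighbouring elements and laziness is given up entirely. This is a sketch only: it works on lists, so it does not meet the question's "produced as needed" constraint, and groupAdjacent is a hypothetical name.

```fsharp
// Eager grouping: keepTogether decides whether two neighbours belong to the same group.
let groupAdjacent (keepTogether: 'a -> 'a -> bool) (items: 'a list) : 'a list list =
    // Walk the list from the right, so each element is consed onto the group that follows it.
    List.foldBack (fun x groups ->
        match groups with
        | (y :: _ as current) :: rest when keepTogether x y -> (x :: current) :: rest
        | _ -> [x] :: groups) items []

let groups = groupAdjacent (fun a b -> b = a + 1) [1; 2; 3; 5; 6; 9]
// groups = [[1; 2; 3]; [5; 6]; [9]]
```

For the time-stamped data in the question, the predicate would compare the two timestamps against the allowed gap, as the other answers do.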
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in Data.List:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs@(y : _))
  | comp x y = (x : firstGroup) : otherGroups
  | otherwise = [x] : groups
  where groups@(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperative nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let rec partitions (first:option<_>) =
        seq {
            match first with
            | Some first' ->
                (* The following value is always overwritten;
                   it represents the first element of the next subsequence to output, if any *)
                let next = ref None
                (* This function generates a subsequence to output,
                   setting next appropriately as it goes *)
                let rec iter item =
                    seq {
                        yield item
                        if (en.MoveNext()) then
                            let curr = en.Current
                            if (cmp item curr) then
                                yield! iter curr
                            else // consumed one too many - pass it on as the start of the next sequence
                                next := Some curr
                        else
                            next := None
                    }
                yield iter first' (* generate the first sequence *)
                yield! partitions !next (* recursively generate all remaining sequences *)
            | None -> () // return an empty sequence if there are no more values
        }
    let first = if en.MoveNext() then Some en.Current else None
    partitions first
let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
    groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let next() = if en.MoveNext() then Some en.Current else None
    (* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
    let rec seqStartingWith start =
        match next() with
        | Some y when cmp start y ->
            let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
            seq { yield start; yield! fst (Lazy.force rest_next) },
            lazy Lazy.force (snd (Lazy.force rest_next))
        | next -> seq { yield start }, lazy next
    let rec iter start =
        seq {
            match (Lazy.force start) with
            | None -> ()
            | Some start ->
                let (first,next) = seqStartingWith start
                yield first
                yield! iter next
        }
    Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished
// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
    // simple unfold that assumes f captured a mutable variable
    let iter f =
        Seq.unfold (fun _ ->
            match f() with
            | Some(x) -> Some(x,())
            | None -> None) ()
    seq {
        let state = ref (Unstarted)
        use ie = input.GetEnumerator()
        let innerMoveNext() =
            match !state with
            | Unstarted ->
                if ie.MoveNext()
                then let cur = ie.Current
                     state := InnerOkay(cur); Some(cur)
                else state := Finished; None
            | InnerOkay(last) ->
                if ie.MoveNext()
                then let cur = ie.Current
                     if f last cur
                     then state := InnerOkay(cur); Some(cur)
                     else state := NeedNewInner(cur); None
                else state := Finished; None
            | NeedNewInner(last) -> state := InnerOkay(last); Some(last)
            | Finished -> None
        let outerMoveNext() =
            match !state with
            | Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
            | InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
            | Finished -> None
        yield! iter outerMoveNext }
open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
    split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey
// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
    let baseTime = DateTime.Now
    seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
    Seq.zip contiguousTimeStamps numbers
    |> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData
printfn "Consuming..."
for group in grouped_data do
    printfn "about to do a group"
    for x in group do
        printfn "   %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:seq<_>) = seq {
    let doneWithThisGroup = ref false
    let areMore = ref true
    use e = input.GetEnumerator()
    let Next() = areMore := e.MoveNext(); !areMore
    // deal with length 0 or 1, seed 'prev'
    if not(e.MoveNext()) then () else
    let prev = ref e.Current
    while !areMore do
        yield seq {
            while not(!doneWithThisGroup) do
                if Next() then
                    let next = e.Current
                    doneWithThisGroup := not(comp !prev next)
                    yield !prev
                    prev := next
                else
                    // end of list, yield final value
                    yield !prev
                    doneWithThisGroup := true }
        doneWithThisGroup := false }
let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn "   %d" x

Resources