I've had a bug in F# code that I have reduced to the following minimal reproduction sequence, but now I don't understand why it works that way.
let duplicate element =
    [ element; element ]

let passThrough (sq: seq<_>) =
    use it = sq.GetEnumerator ()
    seq {
        while (it.MoveNext ()) do
            yield it.Current
    }

[<EntryPoint>]
let main _ =
    [0; 1]
    |> Seq.collect (duplicate)
    (* |> Seq.toArray // When uncommented - works as expected. *)
    |> passThrough
    |> Seq.iter (fun i -> printf $"{i} ")
    0
When the Seq.toArray call is uncommented, it produces the result I expect, i.e. iterates the sequence pipeline and prints 0 0 1 1. However, with that line commented out, the code just finishes without printing anything.
I conferred with one of our experts at F# Slack (thanks R. C.), and was advised that a proper implementation of passThrough should look like the version below, with the use binding moved inside the sequence expression. The enumerator is then disposed only when the while loop exits. The problem with the original implementation is that use is scoped to the body of passThrough, so the enumerator is disposed as soon as passThrough returns the still unevaluated sequence, that is, before the while loop ever runs.
let passThrough (sq: seq<_>) =
    seq {
        use it = sq.GetEnumerator ()
        while (it.MoveNext ()) do
            yield it.Current
    }
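To make the timing visible, here is a minimal sketch. The names `log`, `source`, `piped`, `countBefore` and `items` are made up for illustration. It assumes that the `finally` block of a sequence expression runs exactly once, when the sequence completes or its enumerator is disposed.

```fsharp
let log = ResizeArray<string>()

// Hypothetical source that records when it starts and when it is released.
let source =
    seq {
        try
            log.Add "source started"
            yield 1
            yield 2
        finally
            log.Add "source disposed"
    }

// The corrected passThrough: the enumerator lives inside the sequence expression.
let passThrough (sq: seq<_>) =
    seq {
        use it = sq.GetEnumerator ()
        while it.MoveNext () do
            yield it.Current
    }

let piped = passThrough source     // lazy: nothing has been enumerated yet
let countBefore = log.Count        // still 0 at this point
let items = piped |> Seq.toList    // enumeration, and disposal, happen here
```

After `Seq.toList` returns, `items` holds both elements and the log shows the source being started and then released, in that order.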
The following example is based on a snippet that produces functions that allow enumerating sequence values one by one.
Here printAreEqual () gives true, print2 () gives 12345678910, but print1 () gives 0000000000.
Why cannot the function returned by enumerate return the values of the sequence generated using yield?
open System.Linq

let enumerate (xs: seq<_>) =
    use en = xs.GetEnumerator()
    fun () ->
        en.MoveNext() |> ignore
        en.Current

let s1 = seq { for i in 1 .. 10 do yield i }
let s2 = seq { 1 .. 10 }
let f1 = s1 |> enumerate
let f2 = s2 |> enumerate

let printAreEqual () = Enumerable.SequenceEqual (s1, s2) |> printf "%b" // true
let print1 () = for i in 1 .. 10 do f1() |> printf "%i" // 0000000000
let print2 () = for i in 1 .. 10 do f2() |> printf "%i" // 12345678910
The use en = ... in the enumerate function is effectively doing this:
let enumerate (xs: seq<_>) =
    let en = xs.GetEnumerator()
    let f =
        fun () ->
            en.MoveNext() |> ignore
            en.Current
    en.Dispose()
    f
You're always disposing of the enumerator before you start using it, so the behaviour is probably undefined in this situation and it doesn't matter why you get different results for two sequences with different implementations.
Fine-grained control of sequence enumeration is always tricky, and it's hard to write helper functions for it because of the mutable state.
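One possible way around the early disposal is to hand ownership of the enumerator to the caller. This is only a sketch: `enumerateOwned` is a made-up name, and returning `option` values instead of `en.Current` is a design choice of this example, not something from the question.

```fsharp
open System

// Hypothetical variant of enumerate: the caller decides when to dispose.
let enumerateOwned (xs: seq<'T>) =
    let en = xs.GetEnumerator()
    // Returns None once the sequence is exhausted instead of a stale Current.
    let next () = if en.MoveNext() then Some en.Current else None
    next, (en :> IDisposable)

let s1 = seq { for i in 1 .. 3 do yield i }
let next, handle = enumerateOwned s1

let firstThree = [ next (); next (); next () ]
let afterEnd = next ()

// Release the enumerator only after the last call to next.
handle.Dispose()
```

Here `firstThree` is `[Some 1; Some 2; Some 3]` and `afterEnd` is `None`.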
I've got the following (simplified) code:
open System
open System.IO

[<EntryPoint>]
let main argv =
    let rec lineseq = seq {
        match Console.ReadLine() with
        | null -> yield! Seq.empty
        | line ->
            yield! lineseq
    }
    0
Visual Studio is emitting a "recursive object" warning for the second yield statement, namely yield! lineseq.
Why is this?
This is because you are defining lineseq as a value.
Just write #nowarn "40" at the beginning, as the warning suggests, or add a dummy parameter so it becomes a function:
open System
open System.IO

[<EntryPoint>]
let main argv =
    let rec lineseq x = seq {
        match Console.ReadLine() with
        | null -> yield! Seq.empty
        | line ->
            yield! lineseq x
    }
    // But then you need to call the function with a dummy argument.
    lineseq () |> ignore
    0
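Also note that the sequence will still not be evaluated, because nothing enumerates it. In addition, Console.ReadLine returns null only at end of input (for example, when stdin is closed or redirected from a file); I guess you are waiting for an empty line, which is "".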
Try something like this in order to visualize the results:
let main argv =
    let rec lineseq x = seq {
        match Console.ReadLine() with
        | "" -> yield! Seq.empty
        | line -> yield! lineseq x }
    lineseq () |> Seq.toList |> ignore
    0
It bears a resemblance to this question: Recursive function vs recursive variable in F#
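As an aside, the recursive value can be avoided altogether with Seq.unfold. The following is a sketch only: `readLines` is a hypothetical helper, and stopping on both null and the empty string is an assumption about what the asker wants. A StringReader stands in for the console so the example can run unattended; Console.In could be passed instead.

```fsharp
open System.IO

// Hypothetical helper: lazily reads lines until end of input (null) or an empty line.
let readLines (reader: TextReader) : seq<string> =
    Seq.unfold
        (fun () ->
            match reader.ReadLine() with
            | null | "" -> None           // stop: end of input or blank line
            | line -> Some (line, ()))    // emit the line; there is no state to carry
        ()

// Stand-in for console input; the blank line ends the sequence.
let lines =
    readLines (new StringReader("first\nsecond\n\nignored"))
    |> Seq.toList
```

With this input, `lines` contains "first" and "second", and reading stops at the blank line.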
I can never find the source code of the F# core libraries. I know it is supposedly open but google is not kind to me in helping me locate it, if so I would have looked up the impl of Seq.fold - but here goes the question.
Does anybody see any issue with the following snippet:
let success =
    myList
    |> Seq.fold
        (fun acc item -> evaluation item)
        false
Logically it doesn't seem to hold water and I can and will experiment to test it. But I wanted to ask the community. If any single evaluation inside of myList returns false, I want the success variable to be false...
So the test:
let myList = [true; true]
let success = List.fold (fun acc item -> acc && item) true myList
and
let myList = [true; false; true]
let success = List.fold (fun acc item -> acc && item) true myList
do return the proper results - I just would be more comfy seeing the source...
I think what you're looking for is something like this:
let success =
    myList
    |> Seq.fold
        (fun acc item -> acc && evaluation item)
        true
This also offers "short-circuit" evaluation, so that if acc is false from a previous evaluation, evaluation item won't run and the expression will simply return false.
MSDN documentation for fold operator
Seq.exists will short circuit:
let success =
    [1;2;3;40;5;2]
    |> Seq.exists (fun item -> item > 30)
    |> not
I get that this is an old question, but the following may be relevant to those who have a similar question.
About the specific question here
There already exists a function that returns false as soon as one element of a sequence fails the predicate: Seq.forall.
So the easiest answer to the question is in fact:
let success = Seq.forall evaluation myList
which is slightly easier to grasp than TechNeilogy’s (rewritten) answer
let success = not (Seq.exists (fun item -> not (evaluation item)) myList)
Both in the accepted answer by Wesley Wiser and in this answer, the evaluation function is not evaluated on the items after the first item that evaluates to false.
But, as Pascal Cuoq correctly remarked, in the accepted answer all the elements of the remainder of the list are still iterated over, which is useless.
In contrast, Seq.forall really stops iterating when there is no use in continuing. So do Seq.exists, Seq.takeWhile, …
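The difference can be observed by counting how many elements are pulled from the source. This is a small sketch; `pulled` and `source` are names invented for the illustration.

```fsharp
// Hypothetical instrumented source: counts how many elements are requested.
let pulled = ref 0

let source =
    seq {
        for b in [ true; false; true; true ] do
            pulled.Value <- pulled.Value + 1
            yield b
    }

// The fold skips the right-hand side of && once acc is false,
// but it still walks the whole sequence.
let viaFold = source |> Seq.fold (fun acc item -> acc && item) true
let pulledByFold = pulled.Value

pulled.Value <- 0

// Seq.forall stops at the first element that fails the predicate.
let viaForall = source |> Seq.forall id
let pulledByForall = pulled.Value
```

Both results are false, but `pulledByFold` is 4 while `pulledByForall` is 2.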
About short-circuiting a folding in general
There are other cases where one wants to short-circuit a folding. It can be done.
Step 1: Define a folder with some kind of indication that the state won’t change during the traversal of the rest of the source sequence, so that the folding can be short-circuited.
Step 2: Use Seq.scan instead of Seq.fold.
Seq.scan is like Seq.fold, takes the same arguments, but computes on-demand, and returns not just the final state, but the sequence of all intermediate states and the final state.
It follows that (for finite mySequence): Seq.last (Seq.scan folder initialState mySequence) = Seq.fold folder initialState mySequence
Step 3: Use a short-circuiting function on the output of Seq.scan. Take your pick: Seq.takeWhile, Seq.forall, Seq.exists, …
In the following example, the state becomes None when a duplicate element is found, which means that the scanning may be short-circuited.
let allDistinct mySequence =
    let folder state element =
        match state with
        | Some elementsSoFar when not (Set.contains element elementsSoFar) ->
            Some (Set.add element elementsSoFar)
        | _ ->
            None
    let initialState = Some Set.empty
    let scanning = Seq.scan folder initialState mySequence
    Seq.forall Option.isSome scanning
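Because Seq.scan is lazy, the same three steps also work on an infinite source. Here is a second, separate sketch of the pattern; `firstIndexExceeding` is a hypothetical helper, and Seq.tryFindIndex plays the role of the short-circuiting function from step 3.

```fsharp
// Hypothetical helper: how many elements must be summed before the
// running total exceeds the limit? Returns None if that never happens
// on a finite input.
let firstIndexExceeding limit (xs: seq<int>) =
    xs
    |> Seq.scan (+) 0                                  // running totals, computed on demand
    |> Seq.tryFindIndex (fun total -> total > limit)   // stops at the first hit

// 1, 2, 3, ... is infinite, yet the search terminates.
let idx = firstIndexExceeding 10 (Seq.initInfinite (fun i -> i + 1))
```

The running totals are 0, 1, 3, 6, 10, 15, so `idx` is `Some 5`: five elements are consumed and the rest of the source is never touched.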
Hmmmm, I upgraded my Visual Studio and F# recently, and can't seem to locate the directory containing the F# library code. But, for what it's worth, Seq.fold is equivalent to the following:
let fold f seed items =
    let mutable res = seed
    for item in items do
        res <- f res item
    res
"If any single evaluation inside of myList returns false, I want the success variable to be false..."
It depends on how your evaluation function is implemented. If you want to return false when any of your items are false, use Seq.forall instead.
Something like this:
let l = [true; true; true; false; true]
let eval x = x
let x = (true, l) ||> Seq.fold(fun acc item -> acc && (eval item))
Or do you want to stop evaluation at the first false result?
let l = [true; false; true]
l |> Seq.forall id
As for the original source, here are the fold functions from the August 10, 2010 release.
Shouldn't really need to concern yourself over the implementation, but seeing it can often be educational.
// Seq module
let fold<'T,'State> f (x:'State) (source : seq<'T>) =
    checkNonNull "source" source
    use e = source.GetEnumerator()
    let mutable state = x
    while e.MoveNext() do
        state <- f state e.Current;
    state

// Array module
let fold<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (array:'T[]) =
    checkNonNull "array" array
    let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
    let mutable state = acc
    let len = array.Length
    for i = 0 to len - 1 do
        state <- f.Invoke(state,array.[i])
    state

// List module
let fold<'T,'State> f (s:'State) (list: 'T list) =
    match list with
    | [] -> s
    | _ ->
        let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
        let rec loop s xs =
            match xs with
            | [] -> s
            | h::t -> loop (f.Invoke(s,h)) t
        loop s list

// MapTree module (Used by Map module)
let rec fold (f:OptimizedClosures.FSharpFunc<_,_,_,_>) x m =
    match m with
    | MapEmpty -> x
    | MapOne(k,v) -> f.Invoke(x,k,v)
    | MapNode(k,v,l,r,_) ->
        let x = fold f x l
        let x = f.Invoke(x,k,v)
        fold f x r

// Map module
let fold<'Key,'T,'State when 'Key : comparison> f (z:'State) (m:Map<'Key,'T>) =
    let f = OptimizedClosures.FSharpFunc<_,_,_,_>.Adapt(f)
    MapTree.fold f z m.Tree

// SetTree module (Used by Set module)
let rec fold f x m =
    match m with
    | SetNode(k,l,r,_) ->
        let x = fold f x l in
        let x = f x k
        fold f x r
    | SetOne(k) -> f x k
    | SetEmpty -> x

// Set module
let fold<'T,'State when 'T : comparison> f (z:'State) (s : Set<'T>) =
    SetTree.fold f z s.Tree
Background:
I have a sequence of contiguous, time-stamped data. The data sequence has gaps in it where the data is not contiguous. I want to create a method that splits the sequence up into a sequence of sequences so that each subsequence contains contiguous data (that is, split the input sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/caching)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }

let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
    LazyList.delayed (fun () ->
        match input with
        | LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
        | LazyList.Cons(x,LazyList.Nil) ->
            LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
        | LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
            let groups = GroupBy comp xs
            if comp x y then
                LazyList.consf
                    (LazyList.consf x (fun () ->
                        let (LazyList.Cons(firstGroup,_)) = groups
                        firstGroup))
                    (fun () ->
                        let (LazyList.Cons(_,otherGroups)) = groups
                        otherGroups)
            else
                LazyList.cons (LazyList.cons x (LazyList.empty())) groups)

let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
You seem to want a function that has signature
('a -> bool) -> seq<'a> -> seq<seq<'a>>
That is, it takes a function and a sequence, and breaks up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest. It is not exactly purist and loses much of the laziness of the input, but it avoids iterating the input multiple times:
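// 'fun' is a reserved word in F#, so the predicate is named pred here.
let groupBy (pred: 'a -> bool) (input: seq<'a>) =
    seq {
        let cache = ref (new System.Collections.Generic.List<'a>())
        for e in input do
            (!cache).Add(e)
            // a false result closes the current group
            if not (pred e) then
                yield !cache
                cache := new System.Collections.Generic.List<'a>()
        // emit whatever is left after the last break point
        if (!cache).Count > 0 then
            yield !cache
    }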
An alternative implementation could pass the cache collection (as seq<'a>) to the function so it can see multiple elements when choosing the break points.
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in Data.List:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs@(y : _))
  | comp x y = (x : firstGroup) : otherGroups
  | otherwise = [x] : groups
  where groups@(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
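For reference, F# lists do support this kind of pattern matching, so groupBy1 can be transcribed directly if laziness is given up. The sketch below is eager and not tail-recursive, so it does not satisfy the laziness constraint of the question; it only illustrates the shape of the algorithm.

```fsharp
// Eager sketch on F# lists: compares consecutive elements, like the Haskell groupBy1.
let rec groupBy1 comp xs =
    match xs with
    | [] -> [ [] ]
    | [ x ] -> [ [ x ] ]
    | x :: (y :: _ as rest) ->
        match groupBy1 comp rest with
        // x belongs with y: prepend it to the first group of the rest
        | firstGroup :: otherGroups when comp x y -> (x :: firstGroup) :: otherGroups
        // otherwise x forms a group of its own
        | groups -> [ x ] :: groups

let groups = groupBy1 (fun x y -> y = x + 1) [ 1; 2; 3; 5; 6; 9 ]
```

With this input, `groups` is `[[1; 2; 3]; [5; 6]; [9]]`.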
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperative nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let rec partitions (first:option<_>) =
        seq {
            match first with
            | Some first' ->
                (* The following value is always overwritten;
                   it represents the first element of the next subsequence to output, if any *)
                let next = ref None
                (* This function generates a subsequence to output,
                   setting next appropriately as it goes *)
                let rec iter item =
                    seq {
                        yield item
                        if (en.MoveNext()) then
                            let curr = en.Current
                            if (cmp item curr) then
                                yield! iter curr
                            else // consumed one too many - pass it on as the start of the next sequence
                                next := Some curr
                        else
                            next := None
                    }
                yield iter first' (* generate the first sequence *)
                yield! partitions !next (* recursively generate all remaining sequences *)
            | None -> () // return an empty sequence if there are no more values
        }
    let first = if en.MoveNext() then Some en.Current else None
    partitions first

let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
    groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let next() = if en.MoveNext() then Some en.Current else None
    (* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
    let rec seqStartingWith start =
        match next() with
        | Some y when cmp start y ->
            let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
            seq { yield start; yield! fst (Lazy.force rest_next) },
            lazy Lazy.force (snd (Lazy.force rest_next))
        | next -> seq { yield start }, lazy next
    let rec iter start =
        seq {
            match (Lazy.force start) with
            | None -> ()
            | Some start ->
                let (first,next) = seqStartingWith start
                yield first
                yield! iter next
        }
    Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished

// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
    // simple unfold that assumes f captured a mutable variable
    let iter f = Seq.unfold (fun _ ->
                                match f() with
                                | Some(x) -> Some(x,())
                                | None -> None) ()
    seq {
        let state = ref (Unstarted)
        use ie = input.GetEnumerator()
        let innerMoveNext() =
            match !state with
            | Unstarted ->
                if ie.MoveNext()
                then let cur = ie.Current
                     state := InnerOkay(cur); Some(cur)
                else state := Finished; None
            | InnerOkay(last) ->
                if ie.MoveNext()
                then let cur = ie.Current
                     if f last cur
                     then state := InnerOkay(cur); Some(cur)
                     else state := NeedNewInner(cur); None
                else state := Finished; None
            | NeedNewInner(last) -> state := InnerOkay(last); Some(last)
            | Finished -> None
        let outerMoveNext() =
            match !state with
            | Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
            | InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
            | Finished -> None
        yield! iter outerMoveNext }

open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
    split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey

// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
    let baseTime = DateTime.Now
    seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
    Seq.zip contiguousTimeStamps numbers
    |> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData

printfn "Consuming..."
for group in grouped_data do
    printfn "about to do a group"
    for x in group do
        printfn " %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }

let rec GroupBy comp (input:seq<_>) = seq {
    let doneWithThisGroup = ref false
    let areMore = ref true
    use e = input.GetEnumerator()
    let Next() = areMore := e.MoveNext(); !areMore
    // deal with length 0 or 1, seed 'prev'
    if not(e.MoveNext()) then () else
    let prev = ref e.Current
    while !areMore do
        yield seq {
            while not(!doneWithThisGroup) do
                if Next() then
                    let next = e.Current
                    doneWithThisGroup := not(comp !prev next)
                    yield !prev
                    prev := next
                else
                    // end of list, yield final value
                    yield !prev
                    doneWithThisGroup := true }
        doneWithThisGroup := false }

let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x