Why does this definition return a function? - f#

I found the following in the book Expert F# 4.0, Fourth Edition, by Don Syme, Adam Granicz, and Antonio Cisternino:
let generateStamp =
    let mutable count = 0
    (fun () -> count <- count + 1; count)
I could not understand why this code creates a function:
val generateStamp : (unit -> int)
It looks to me like its signature should be
val generateStamp : int
For example, the following code:
let gS =
    let mutable count = 0
    (printfn "%d" count; count)
creates an int value:
val gS : int = 0
As I understand it the code (fun () -> count <- count + 1; count) should first evaluate the lambda and then count. So the value of generateStamp should be just count, as it is in the definition of gS. What am I missing?

In any block of F# code, the last expression in that block will be the value of that block. A block can be defined in one of two ways: by indentation, or with ; between the block's expressions.
The expression fun () -> other expressions here creates a function. Since that's the last expression in the code block under let generateStamp =, that's the value that gets stored in generateStamp.
Your confusion is that you think that the expressions inside the fun () are going to be evaluated immediately as part of the value of generateStamp, but they're not. They are defining the body of the anonymous function returned by the fun () expression. You're absolutely right that inside that block of code, count is the last expression and so it's the thing returned by that function. But the fun () expression creates a function, which will only evaluate its contents later when it is called. It does not evaluate its contents immediately.
By contrast, the expression (printfn "%d" count; count) is a block of code with two expressions in it. It is not a function, so it will be immediately evaluated. Its last expression is count, so the value of the code block (printfn "%d" count; count) is count. Since the (printfn "%d" count; count) block is being evaluated immediately, you can mentally replace it with count. And so the value of gS is count, whereas the value of generateStamp is a function that will return count when it's evaluated.
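To see the difference in practice, here is a minimal sketch. It assumes F# 4.0 or later (where a local let mutable may be captured by a closure); the names first and second are only for illustration.

```fsharp
// generateStamp is bound to the lambda itself; its body runs only when called.
let generateStamp =
    let mutable count = 0
    (fun () -> count <- count + 1; count)

// gS is bound to the result of a block that is evaluated once, immediately.
let gS =
    let mutable count = 0
    (printfn "%d" count; count)

let first = generateStamp ()   // the lambda body runs now: count becomes 1
let second = generateStamp ()  // and again: count becomes 2
printfn "%d %d %d" first second gS
```

Each call to generateStamp runs the lambda body again, whereas gS was computed once when it was defined.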

It's syntactic trickery. The last ; count part is actually part of the lambda, not the next expression after it.
Here are some simplified examples to work through:
let x = 1; 2; 3 // x = 3
let f x = 1; 2; 3 // f is a function
let y = f 5 // y = 3, result of calling function "f"
let f = fun x -> 1; 2; 3 // Equivalent to the previous definition of "f"
let y = f 5 // y = 3, same as above
let f =
    fun x -> 1; 2; 3 // Still equivalent
let y = f 5 // y = 3, same as above
let f =
    let z = 5
    fun x -> 1; 2; 3 // Still equivalent
let y = f 5 // y = 3, same as above
// Your original example. See the similarity?
let generateStamp =
    let mutable count = 0
    fun () -> count <- count + 1; count
Now, if you wanted to have count be the return value of generateStamp, you'd need to put it either outside the parens or on the next line:
// The following two definitions will make "generateStamp" have type "int"
let generateStamp =
    let mutable count = 0
    (fun () -> count <- count + 1); count
let generateStamp =
    let mutable count = 0
    (fun () -> count <- count + 1)
    count

Related

Write a for loop that increments the index twice

In F# the documentation provides two standard for loops. The for to expression is the loop which provides an index, incremented or decremented per item, depending on whether it is a for to or for downto expression.
I want to loop over an array and increment the index by a variable amount each time; specifically by two. In C# this is very straightforward:
for(int i = 0; i < somelength; i += 2) { ... }
How would I achieve the same thing in F#?
You can specify the step using the following syntax:
for x in 0 .. 2 .. somelength do
    printfn "%d" x
For more information, see the documentation for the for .. in expression. More generally, you can also use this for iterating over any sequence (IEnumerable), so this behaves more like C# foreach.
Tomas's answer is correct and elegant, but it is worth considering that in F# a loop with an increment of 2 is slower than a loop with an increment of 1.
Faster loops in F#:
let print x = printfn "%A" x
// Only increment by +1/-1 allowed for ints
let case0 () = for x = 0 to 10 do print x
let case1 () = for x = 10 downto 0 do print x
// Special handling in F# compiler ensures these are fast
let case2 () = for x in 0..10 do print x
let case3 (vs : int array) = for x in vs do print x
let case4 (vs : int list) = for x in vs do print x
let case5 (vs : string) = for x in vs do print x
Slower loops in F#:
let print x = printfn "%A" x
// Not int32s
let case0 () = for x in 0L..10L do print x
let case1 () = for x in 0s..10s do print x
let case2 () = for x in 0.0..10.0 do print x
// Not implicit +1/-1 increment
let case3 () = for x in 0..1..10 do print x
let case4 () = for x in 10..-1..0 do print x
let case5 () = for x in 0..2..10 do print x
let case6 () = for x in 10..-2..0 do print x
// Falls back on seq for all cases except arrays, lists and strings
let case7 (vs : int seq) = for x in vs do print x
let case8 (vs : int ResizeArray) = for x in vs do print x
// Very close to fast case 2 but creates an unnecessary list
let case9 () = for x in [0..10] do print x
When the F# compiler doesn't have special handling to ensure quick iteration, it falls back on generic code that looks a bit like this:
use e = (Operators.OperatorIntrinsics.RangeInt32 0 2 10).GetEnumerator()
while e.MoveNext() do
    print e.Current
This might or might not be a problem to you but it's worth knowing about I think.
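If the enumerator fallback matters in your case, one option is a plain while loop with a mutable index, which mirrors the C# loop directly. This is only a sketch: the function name evenIndices and the ResizeArray used to collect the visited indices are made up for illustration, and I have not benchmarked it against the range form.

```fsharp
// Visits 0, 2, 4, ... while the index is below someLength,
// like for (int i = 0; i < someLength; i += 2) in C#.
let evenIndices someLength =
    let visited = ResizeArray<int>()   // stands in for "do something with i"
    let mutable i = 0
    while i < someLength do
        visited.Add i
        i <- i + 2
    List.ofSeq visited

printfn "%A" (evenIndices 7)
```

Replace the visited.Add call with whatever work the loop body should do.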
IMHO tail recursion is the way to loop, as for and while have a kind of imperative taste to them, and thanks to tail call optimization in F#, tail recursion is fast if written correctly.
let rec loop i =
    if i < someLength then
        doSomething i
        loop (i + 2)
loop 0
Tomas already answered your syntax question. Another answer suggests using tail recursion instead.
A third approach with a more f-sharpy feel to it would be something like this:
let myArray = [| 1; 2; 3 ; 4 |]
let stepper f step a =
    a
    // Array.mapi passes the index first, then the element
    |> Array.mapi (fun i x -> if i % step = 0 then Some (f x) else None)
    |> Array.choose id
printfn "%A" <| stepper (fun x -> x * 2) 2 myArray
// prints [|2; 6|]

Functional digits reversion

In C, I would solve the problem with a loop. To represent the idea, something like:
void foo(int x){
while(x > 0){
printf("%d", x % 10);
x /= 10;
}
}
With F#, I am unable to make the function return the single values. I tried:
let reverse n =
    let aux =
        fun x ->
            x % 10
    let rec aux2 =
        fun x ->
            if x = 0 then 0
            else aux2(aux(x / 10))
    aux2 n
but it returns always the base case 0.
I cannot get beyond this approach, in which the recursion results are combined by an operation and cannot (as far as I understand) be reported individually:
let reverse2 n =
    let rec aux =
        fun x ->
            if x = 0 then 0
            else (x % 10) + aux (x / 10) // The operation returning the result
    aux n
This is a simple exercise I am doing in order to "functionalize" my mind. Hence, I am looking for an approach to this problem not involving library functions.
A for loop that changes the value of mutable variables can be rewritten as a recursive function. You can think of the mutable variables as implicit parameters to the function. So if we have a mutable variable x, we need to instead pass the new state of x explicitly as a function parameter. The closest equivalent to your C function as a recursive F# function is this:
let rec foo x =
    if x > 0 then
        printf "%d" (x % 10)
        foo (x / 10)
This in itself isn't particularly functional because it returns unit and only has side effects. You can collect the result of each loop using another parameter. This is often called an accumulator:
let foo x =
    let rec loop x acc =
        if x > 0 then
            loop (x / 10) (x % 10 :: acc)
        else acc
    loop x [] |> List.rev
foo 100 // [0; 0; 1]
I made an inner loop function that is actually the recursive one. The outer foo function starts off the inner loop with [] as the accumulator. Items are added to the start of the list during each iteration and the accumulator list is reversed at the end.
You can use another type as the accumulator, e.g. a string, and append to the string instead of adding items to the list.
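As a rough sketch of the string variant (the name reverseDigits is mine, not from the question): each digit is appended to the end of the accumulator, so no final reversal is needed.

```fsharp
// Same loop shape as above, but the accumulator is a string.
// Digits are appended in the order they are produced, which is already reversed.
let reverseDigits x =
    let rec loop x (acc: string) =
        if x > 0 then
            loop (x / 10) (acc + string (x % 10))
        else acc
    loop x ""

printfn "%s" (reverseDigits 1230) // prints 0321
```

Note that, as with the list version, an input of 0 gives an empty result because the loop body never runs.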

F# Parallel.ForEach invalid method overload

Creating a Parallel.ForEach expression of this form:
let low = max 1 (k-m)
let high = min (k-1) n
let rangesize = (high+1-low)/(PROCS*3)
Parallel.ForEach(Partitioner.Create(low, high+1, rangesize), (fun j ->
    let i = k - j
    if x.[i-1] = y.[j-1] then
        a.[i] <- b.[i-1] + 1
    else
        a.[i] <- max c.[i] c.[i-1]
    )) |> ignore
Causes me to receive the error: No overloads match for method 'ForEach'. However I am using the Parallel.ForEach<TSource> Method (Partitioner<TSource>, Action<TSource>) and it seems right to me. Am I missing something?
Edited: I am trying to obtain the same results as the code below (that does not use a Partitioner):
let low = max 1 (k-m)
let high = min (k-1) n
let rangesize = (high+1-low)/(PROCS*3)
let A = [| low .. high |]
Parallel.ForEach(A, fun (j:int) ->
    let i = k - j
    if x.[i-1] = y.[j-1] then
        a.[i] <- b.[i-1] + 1
    else
        a.[i] <- max c.[i] c.[i-1]
    ) |> ignore
Are you sure that you have opened all necessary namespaces, all the values you are using (low, high and PROCS) are defined and that your code does not accidentally redefine some of the names that you're using (like Partitioner)?
I created a very simple F# script with this code and it seems to be working fine (I refactored the code to create a partitioner called p, but that does not affect the behavior):
open System.Threading.Tasks
open System.Collections.Concurrent
let PROCS = 10
let low, high = 0, 100
let p = Partitioner.Create(low, high+1, high+1-low/(PROCS*3))
Parallel.ForEach(p, (fun j ->
    printfn "%A" j // Print the desired range (using %A as it is a tuple)
    )) |> ignore
It is important that the value j is actually a pair of type int * int, so if the body uses it in a wrong way (e.g. as an int), you will get the error. In that case, you can add a type annotation to j and you would get a more useful error elsewhere:
Parallel.ForEach(p, (fun (j:int * int) ->
    printfn "%d" j // Error here, because `j` is used as an int, but it is a pair!
    )) |> ignore
This means that if you want to perform something for all j values in the original range, you need to write something like this:
Parallel.ForEach(p, (fun (loJ, hiJ) ->
    for j in loJ .. hiJ - 1 do // Iterate over all js in this partition
        printfn "%d" j // process the current j
    )) |> ignore
Aside, I guess that the last argument to Partitioner.Create should actually be (high+1-low)/(PROCS*3) - you probably want to divide the total number of steps, not just the low value.
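Putting these pieces together, a self-contained sketch might look like the following. The array squares, the size n and the helper fillRange are invented for illustration (they stand in for the a, b, c arrays of the question), and the delegate is built explicitly as Action<int * int> so that the intended overload is unambiguous.

```fsharp
open System
open System.Threading.Tasks
open System.Collections.Concurrent

let n = 1000
let squares : int[] = Array.zeroCreate n

// Partitioner.Create requires a positive range size, hence the max 1.
let rangeSize = max 1 (n / (Environment.ProcessorCount * 3))
let p = Partitioner.Create(0, n, rangeSize)

// Each work item is a half-open range (lo, hi), not a single index.
let fillRange =
    Action<int * int>(fun (lo, hi) ->
        for j in lo .. hi - 1 do
            squares.[j] <- j * j)

Parallel.ForEach(p, fillRange) |> ignore
printfn "%d" squares.[10]
```

Every index is written by exactly one partition, so no locking is needed in this particular case.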

f#: initialize array of array

In the following code, are the arrays of arrays A and B the same?
let A = Array.init 3 (fun _ -> Array.init 2 (fun _ -> 0))
let defaultCreate n defaultValue = Array.init n (fun _ -> defaultValue)
let B = defaultCreate 3 (defaultCreate 2 0)
If I assign values to A and B, they end up different. What happened? Thanks.
for i = 0 to 2 do
    for j = 0 to 1 do
        A.[i].[j] <- i + j
        B.[i].[j] <- i + j
printfn "%A vs %A" A B
A = [|[|0; 1|]; [|1; 2|]; [|2; 3|]|] and B = [|[|2; 3|]; [|2; 3|]; [|2; 3|]|]
let B = defaultCreate 3 (defaultCreate 2 0)
You create an array and then you use this array as values for each cell.
It's as if you did something like this:
let a = [|1; 2; 3; 4|]
let b = [|a; a; a; a|]
The same array a is used for every cell (think pointer to a if you're used to C). Thus, modifying b.[0].[1] will change every a.[1].
In my sample:
> b.[0].[1] <- 10;;
val it : unit = ()
> b;;
[|[|1; 10; 3; 4|]; [|1; 10; 3; 4|]; [|1; 10; 3; 4|]; [|1; 10; 3; 4|]|]
The same thing happens with your code.
They are not the same.
Arrays are reference types, and are stored on the heap. When you create an array with another array as the default value, you are storing references to the same array, over and over again.
Numbers are another matter. They are immutable value types, so each slot holds its own copy of the value rather than a reference. You can't change the value of 1 to anything other than 1.
To create a "jagged" array, you need to call Array.init from inside the initializer to the first Array.init call, to create new arrays for each slot.
Also, you could use Array.create if you do want to have the same value in every slot. Be careful about reference types though.
let A = Array.init 3 (fun _ -> Array.create 2 0)
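A small sketch that makes the sharing visible (the names sharedRows and freshRows are mine): Array.create evaluates its argument once and stores the same reference in every slot, while Array.init runs the initializer once per slot.

```fsharp
// One inner array, referenced three times.
let sharedRows = Array.create 3 (Array.create 2 0)
// Three distinct inner arrays, one per call of the initializer.
let freshRows = Array.init 3 (fun _ -> Array.create 2 0)

sharedRows.[0].[1] <- 9
freshRows.[0].[1] <- 9

printfn "%A" sharedRows // [|[|0; 9|]; [|0; 9|]; [|0; 9|]|]
printfn "%A" freshRows  // [|[|0; 9|]; [|0; 0|]; [|0; 0|]|]
printfn "%b" (obj.ReferenceEquals(sharedRows.[0], sharedRows.[1])) // true
printfn "%b" (obj.ReferenceEquals(freshRows.[0], freshRows.[1]))   // false
```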

F#: How do i split up a sequence into a sequence of sequences

Background:
I have a sequence of contiguous, time-stamped data. The data-sequence has gaps in it where the data is not contiguous. I want to create a method to split the sequence up into a sequence of sequences so that each subsequence contains contiguous data (split the input-sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/caching)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
    LazyList.delayed (fun () ->
        match input with
        | LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
        | LazyList.Cons(x,LazyList.Nil) ->
            LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
        | LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
            let groups = GroupBy comp xs
            if comp x y then
                LazyList.consf
                    (LazyList.consf x (fun () ->
                        let (LazyList.Cons(firstGroup,_)) = groups
                        firstGroup))
                    (fun () ->
                        let (LazyList.Cons(_,otherGroups)) = groups
                        otherGroups)
            else
                LazyList.cons (LazyList.cons x (LazyList.empty())) groups)
let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
You seem to want a function that has signature
('a -> bool) -> seq<'a> -> seq<seq<'a>>
I.e. a function and a sequence, then break up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest. It is not exactly purist, and it loses much of the laziness of the input, but it avoids iterating the input multiple times:
let groupBy (pred: 'a -> bool) (input: seq<'a>) =
    seq {
        // Elements of the group currently being built
        let cache = ref (new System.Collections.Generic.List<'a>())
        for e in input do
            (!cache).Add(e)
            // A false result ends the current group (e is its last element)
            if not (pred e) then
                yield (!cache :> seq<'a>)
                cache := new System.Collections.Generic.List<'a>()
        // Emit whatever is left once the input is exhausted
        if (!cache).Count > 0 then
            yield (!cache :> seq<'a>)
    }
An alternative implementation could pass the cache collection (as seq<'a>) to the function so it can see multiple elements to choose the break points.
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in Data.List:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs@(y : _))
  | comp x y = (x : firstGroup) : otherGroups
  | otherwise = [x] : groups
  where groups@(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
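For comparison, a direct translation of groupBy1 to plain F# lists could look like the sketch below. It is eager and not tail-recursive, so it does not meet the laziness constraint in the question; it only shows the consecutive-element comparison on a small input.

```fsharp
// Eager list version of groupBy1: compares each element with its successor.
let rec groupBy1 comp xs =
    match xs with
    | [] -> [ [] ]
    | [ x ] -> [ [ x ] ]
    | x :: (y :: _ as rest) ->
        match groupBy1 comp rest with
        // x belongs to the same group as y: prepend it to the first group
        | firstGroup :: otherGroups when comp x y -> (x :: firstGroup) :: otherGroups
        // otherwise x forms a group of its own
        | groups -> [ x ] :: groups

printfn "%A" (groupBy1 (fun x y -> y = x + 1) [1; 2; 3; 5; 6; 9])
// prints [[1; 2; 3]; [5; 6]; [9]]
```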
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperative nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let rec partitions (first:option<_>) =
        seq {
            match first with
            | Some first' -> //'
                (* The following value is always overwritten;
                   it represents the first element of the next subsequence to output, if any *)
                let next = ref None
                (* This function generates a subsequence to output,
                   setting next appropriately as it goes *)
                let rec iter item =
                    seq {
                        yield item
                        if (en.MoveNext()) then
                            let curr = en.Current
                            if (cmp item curr) then
                                yield! iter curr
                            else // consumed one too many - pass it on as the start of the next sequence
                                next := Some curr
                        else
                            next := None
                    }
                yield iter first' (* ' generate the first sequence *)
                yield! partitions !next (* recursively generate all remaining sequences *)
            | None -> () // return an empty sequence if there are no more values
        }
    let first = if en.MoveNext() then Some en.Current else None
    partitions first
let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
    groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
    let en = sq.GetEnumerator()
    let next() = if en.MoveNext() then Some en.Current else None
    (* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
    let rec seqStartingWith start =
        match next() with
        | Some y when cmp start y ->
            let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
            seq { yield start; yield! fst (Lazy.force rest_next) },
            lazy Lazy.force (snd (Lazy.force rest_next))
        | next -> seq { yield start }, lazy next
    let rec iter start =
        seq {
            match (Lazy.force start) with
            | None -> ()
            | Some start ->
                let (first,next) = seqStartingWith start
                yield first
                yield! iter next
        }
    Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished
// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
    // simple unfold that assumes f captured a mutable variable
    let iter f = Seq.unfold (fun _ ->
                                match f() with
                                | Some(x) -> Some(x,())
                                | None -> None) ()
    seq {
        let state = ref (Unstarted)
        use ie = input.GetEnumerator()
        let innerMoveNext() =
            match !state with
            | Unstarted ->
                if ie.MoveNext()
                then let cur = ie.Current
                     state := InnerOkay(cur); Some(cur)
                else state := Finished; None
            | InnerOkay(last) ->
                if ie.MoveNext()
                then let cur = ie.Current
                     if f last cur
                     then state := InnerOkay(cur); Some(cur)
                     else state := NeedNewInner(cur); None
                else state := Finished; None
            | NeedNewInner(last) -> state := InnerOkay(last); Some(last)
            | Finished -> None
        let outerMoveNext() =
            match !state with
            | Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
            | InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
            | Finished -> None
        yield! iter outerMoveNext }
open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
    split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey
// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
    let baseTime = DateTime.Now
    seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
    Seq.zip contiguousTimeStamps numbers
    |> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData
printfn "Consuming..."
for group in grouped_data do
    printfn "about to do a group"
    for x in group do
        printfn " %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
    seq {
        for x in 1..N do
            if x % 4 <> 0 && x % 7 <> 0 then
                printfn "producing %d" x
                yield x
    }
let rec GroupBy comp (input:seq<_>) = seq {
    let doneWithThisGroup = ref false
    let areMore = ref true
    use e = input.GetEnumerator()
    let Next() = areMore := e.MoveNext(); !areMore
    // deal with length 0 or 1, seed 'prev'
    if not(e.MoveNext()) then () else
    let prev = ref e.Current
    while !areMore do
        yield seq {
            while not(!doneWithThisGroup) do
                if Next() then
                    let next = e.Current
                    doneWithThisGroup := not(comp !prev next)
                    yield !prev
                    prev := next
                else
                    // end of list, yield final value
                    yield !prev
                    doneWithThisGroup := true }
        doneWithThisGroup := false }
let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
    printfn "about to do a group"
    for x in group do
        printfn " %d" x
