Seq.cache with Seq.unfold in F# - f#

I'm trying to cache the result of an infinite sequence of triangle numbers.
let triCalc n = (n*(n+1))/2
let triNumC = 0 |> Seq.unfold (fun n -> Some((triCalc n, n+1))) |> Seq.cache
However, when I try to cache it doesn't seem to work as I expect
let x = Seq.take 4 (triNumC)
let y = Seq.take 4 (triNumC)
printf "%A" x
printf "%A" y
This prints
seq [0; 1; 3; 6]seq [0; 1; 3; 6]
Am I caching incorrectly? Shouldn't the second sequence printed be a continuation of the first? I am not quite sure how to progress.

If you want the continuation you need to skip. The sequence remains the same
let y = triNumC
|> Seq.skip 4
|> Seq.take 4
Caching is fine. The benefit of caching is that when calculating y is that starts from the 5th term onwards since the first 4 are already cached.

Related

Seq.filter and infinite seems counterintuitive

I'm trying to understand the results of applying filters over infinite lists. I'll need an explanation to understand what is happening under the covers.
let nums = Seq.initInfinite id |> Seq.filter ((>) 0);;
nums |> Seq.take 0;; // of course, this works
nums |> Seq.take 1;; // this overflows
To my reading, line one means "start an infinite sequence at zero and filter out all values greater than zero (such as 1, 2, 3 etc...)". Therefore, getting zero values returns an empty set. Ok, makes sense. But if we take one value, and that value would be zero which is clearly not greater than zero, why the overflow?
Experimenting further:
let nums = Seq.initInfinite (fun i -> i - 1) |> Seq.filter ((>) 0);;
nums |> Seq.take 1;; // this works and returns -1
nums |> Seq.take 2;; // this overflows
Again, as I read this, zero should be a valid value to which the sequence can iterate.
Compound my confusion:
let nums = Seq.initInfinite id |> Seq.filter ((>=) 0);;
nums |> Seq.take 1;; // works, [0]
Maybe I can reconcile the behavior by assuming the filter actually means, "values are not valid UNLESS they are >= 0". But that doesn't merit out:
let nums = Seq.initInfinite (fun i -> i - 1) |> Seq.filter ((>=) 0);;
nums |> Seq.take 1;; // works [-1]
nums |> Seq.take 2;; // works [-1, 0]
It cannot be the case that the filter defines valid values... for -1 is not >= 0 in the case of filter ((>=) 0).
It cannot be the case that the filter defines invalid values... for 0 is not > 0 in the case of filter ((>) 0).
Please unscramble this for me.
Because of how you're using the predicate it's quite confusing what it actually means. Let's solve that problem first, by pushing a finite sequance with known set of values by your filter.
> let input = [-5; 0; 5 ];;
val input : int list = [-5; 0; 5]
> let output = input |> Seq.filter ((>) 0) |> Seq.toList;;
val output : int list = [-5]
Wait, why is -5 returned? That's because (>) 0 does not actually mean x > 0, it means (>) 0 x, which is equivalent to 0 > x. So your Seq.filter ((>) 0) means the same as Seq.filter (fun i -> 0 > i):
> let output = input |> Seq.filter (fun i -> 0 > i) |> Seq.toList;;
val output : int list = [-5]
Now that we have that figured out, let's try looking into the overflow.
let nums = Seq.initInfinite id |> Seq.filter ((>) 0);;
Seq.initInfinite id returns an infinite collection starting with 0 with next element equals to prev + 1.
> Seq.initInfinite id |> Seq.take 5 |> Seq.toList;;
val it : int list = [0; 1; 2; 3; 4] // this will keep going if you remove Seq.take
Now you're adding filter, which only returns element smaller than 0. How many element like this there are? None!. So your filter will keep asking for next element out of Seq.initInfinite until you overflow Integer range and exception is thrown.
Sequences are lazily evaluated, so at first glance this behaviour seems a bit strange since you aren't calling toList, toArray etc.
In fact it's only overflowing because you're using F# interactive, which tries to print the first few elements of the sequence when you press return. It therefore attempts to iterate through your infinite sequence until it finds some items that match your predicate and never finds any, hence the overflow.

Problems Creating an Infinite Lazy List

I completed the seventh Euler problem* in F# but am not entirely happy with my
implementation. In the function primes I create a sequence that I estimated would contain the 10,001st prime number. When I tried using Seq.initInfinite to lazily generate the candidate primes my code just hung before throwing an out of memory exception.
Could someone advise me on replacing the literal sequence with a lazily-generated sequence which is short-circuited once the desired prime is found?
let isPrime n =
let bound = int (sqrt (float n))
seq {2 .. bound} |> Seq.forall (fun x -> n % x <> 0)
let primeAsync n =
async { return (n, isPrime n)}
let primes =
{1..1000000}
|> Seq.map primeAsync
|> Async.Parallel
|> Async.RunSynchronously
|> Array.filter snd
|> Array.map fst
|> Array.mapi (fun i el -> (i, el))
|> Array.find (fun (fst, snd) -> fst = 10001)
primes
*"By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13. What is the 10,001st prime number?"
I think the problem is/was that Async.RunSynchronously isn't lazy and tried to evaluate the whole infinite sequence. Although there are better solutions for this, your algorithm is fast enough, so you don't even need parallelization; this works perfectly:
open System
let isPrime n =
let bound = n |> float |> sqrt |> int
seq {2 .. bound} |> Seq.forall (fun x -> n % x <> 0)
let prime =
Seq.initInfinite ((+) 2)
|> Seq.filter isPrime
|> Seq.skip 10000
|> Seq.head
The sequence gets 'reified' as soon as you feed it to Async.Parallel. If you want to minimise memory consumption, run the computation serially or split it into lazy chunks, the elements in each chunk to be run in parallel.

F# divide sequence up in blocks [duplicate]

I'm trying to learn F# by rewriting some C# algorithms I have into idiomatic F#.
One of the first functions I'm trying to rewrite is a batchesOf where:
[1..17] |> batchesOf 5
Which would split the sequence into batches with a max of five in each, i.e:
[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
My first attempt at doing this is kind of ugly where I've resorted to using a mutable ref object after running into errors trying to use mutable type inside the closure. Using ref is particularly unpleasant since to dereference it you have to use the ! operator which when inside a condition expression can be counter intuitive to some devs who will read it as logical not. Another problem I ran into is where Seq.skip and Seq.take are not like their Linq aliases in that they will throw an error if size exceeds the size of the sequence.
let batchesOf size (sequence: _ seq) : _ list seq =
seq {
let s = ref sequence
while not (!s |> Seq.isEmpty) do
yield !s |> Seq.truncate size |> List.ofSeq
s := System.Linq.Enumerable.Skip(!s, size)
}
Anyway what would be the most elegant/idiomatic way to rewrite this in F#? Keeping the original behaviour but preferably without the ref mutable variable.
Implementing this function using the seq<_> type idiomatically is difficult - the type is inherently mutable, so there is no simple nice functional way. Your version is quite inefficient, because it uses Skip repeatedly on the sequence. A better imperative option would be to use GetEnumerator and just iterate over elements using IEnumerator. You can find various imperative options in this snippet: http://fssnip.net/1o
If you're learning F#, then it is better to try writing the function using F# list type. This way, you can use idiomatic functional style. Then you can write batchesOf using pattern matching with recursion and accumulator argument like this:
let batchesOf size input =
// Inner function that does the actual work.
// 'input' is the remaining part of the list, 'num' is the number of elements
// in a current batch, which is stored in 'batch'. Finally, 'acc' is a list of
// batches (in a reverse order)
let rec loop input num batch acc =
match input with
| [] ->
// We've reached the end - add current batch to the list of all
// batches if it is not empty and return batch (in the right order)
if batch <> [] then (List.rev batch)::acc else acc
|> List.rev
| x::xs when num = size - 1 ->
// We've reached the end of the batch - add the last element
// and add batch to the list of batches.
loop xs 0 [] ((List.rev (x::batch))::acc)
| x::xs ->
// Take one element from the input and add it to the current batch
loop xs (num + 1) (x::batch) acc
loop input 0 [] []
As a footnote, the imperative version can be made a bit nicer using computation expression for working with IEnumerator, but that's not standard and it is quite advanced trick (for example, see http://fssnip.net/37).
A friend asked me this a while back. Here's a recycled answer. This works and is pure:
let batchesOf n =
Seq.mapi (fun i v -> i / n, v) >>
Seq.groupBy fst >>
Seq.map snd >>
Seq.map (Seq.map snd)
Or an impure version:
let batchesOf n =
let i = ref -1
Seq.groupBy (fun _ -> i := !i + 1; !i / n) >> Seq.map snd
These produce a seq<seq<'a>>. If you really must have an 'a list list as in your sample then just add ... |> Seq.map (List.ofSeq) |> List.ofSeq as in:
> [1..17] |> batchesOf 5 |> Seq.map (List.ofSeq) |> List.ofSeq;;
val it : int list list = [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
Hope that helps!
This can be done without recursion if you want
[0..20]
|> Seq.mapi (fun i elem -> (i/size),elem)
|> Seq.groupBy (fun (a,_) -> a)
|> Seq.map (fun (_,se) -> se |> Seq.map (snd));;
val it : seq<seq<int>> =
seq
[seq [0; 1; 2; 3; ...]; seq [5; 6; 7; 8; ...]; seq [10; 11; 12; 13; ...];
seq [15; 16; 17; 18; ...]; ...]
Depending on how you think this may be easier to understand. Tomas' solution is probably more idiomatic F# though
Hurray, we can use List.chunkBySize, Seq.chunkBySize and Array.chunkBySize in F# 4, as mentioned by Brad Collins and Scott Wlaschin.
This isn't perhaps idiomatic but it works:
let batchesOf n l =
let _, _, temp', res' = List.fold (fun (i, n, temp, res) hd ->
if i < n then
(i + 1, n, hd :: temp, res)
else
(1, i, [hd], (List.rev temp) :: res))
(0, n, [], []) l
(List.rev temp') :: res' |> List.rev
Here's a simple implementation for sequences:
let chunks size (items:seq<_>) =
use e = items.GetEnumerator()
let rec loop i acc =
seq {
if i = size then
yield (List.rev acc)
yield! loop 0 []
elif e.MoveNext() then
yield! loop (i+1) (e.Current::acc)
else
yield (List.rev acc)
}
if size = 0 then invalidArg "size" "must be greater than zero"
if Seq.isEmpty items then Seq.empty else loop 0 []
let s = Seq.init 10 id
chunks 3 s
//output: seq [[0; 1; 2]; [3; 4; 5]; [6; 7; 8]; [9]]
My method involves converting the list to an array and recursively chunking the array:
let batchesOf (sz:int) lt =
let arr = List.toArray lt
let rec bite curr =
if (curr + sz - 1 ) >= arr.Length then
[Array.toList arr.[ curr .. (arr.Length - 1)]]
else
let curr1 = curr + sz
(Array.toList (arr.[curr .. (curr + sz - 1)])) :: (bite curr1)
bite 0
batchesOf 5 [1 .. 17]
[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
I found this to be a quite terse solution:
let partition n (stream:seq<_>) = seq {
let enum = stream.GetEnumerator()
let rec collect n partition =
if n = 1 || not (enum.MoveNext()) then
partition
else
collect (n-1) (partition # [enum.Current])
while enum.MoveNext() do
yield collect n [enum.Current]
}
It works on a sequence and produces a sequence. The output sequence consists of lists of n elements from the input sequence.
You can solve your task with analog of Clojure partition library function below:
let partition n step coll =
let rec split ss =
seq {
yield(ss |> Seq.truncate n)
if Seq.length(ss |> Seq.truncate (step+1)) > step then
yield! split <| (ss |> Seq.skip step)
}
split coll
Being used as partition 5 5 it will provide you with sought batchesOf 5 functionality:
[1..17] |> partition 5 5;;
val it : seq<seq<int>> =
seq
[seq [1; 2; 3; 4; ...]; seq [6; 7; 8; 9; ...]; seq [11; 12; 13; 14; ...];
seq [16; 17]]
As a premium by playing with n and step you can use it for slicing overlapping batches aka sliding windows, and even apply to infinite sequences, like below:
Seq.initInfinite(fun x -> x) |> partition 4 1;;
val it : seq<seq<int>> =
seq
[seq [0; 1; 2; 3]; seq [1; 2; 3; 4]; seq [2; 3; 4; 5]; seq [3; 4; 5; 6];
...]
Consider it as a prototype only as it does many redundant evaluations on the source sequence and not likely fit for production purposes.
This version passes all my tests I could think of including ones for lazy evaluation and single sequence evaluation:
let batchIn batchLength sequence =
let padding = seq { for i in 1 .. batchLength -> None }
let wrapped = sequence |> Seq.map Some
Seq.concat [wrapped; padding]
|> Seq.windowed batchLength
|> Seq.mapi (fun i el -> (i, el))
|> Seq.filter (fun t -> fst t % batchLength = 0)
|> Seq.map snd
|> Seq.map (Seq.choose id)
|> Seq.filter (fun el -> not (Seq.isEmpty el))
I am still quite new to F# so if I'm missing anything - please do correct me, it will be greatly appreciated.

What's the style for immutable set and map in F#

I have just solved problem23 in Project Euler, in which I need a set to store all abundant numbers. F# has a immutable set, I can use Set.empty.Add(i) to create a new set containing number i. But I don't know how to use immutable set to do more complicated things.
For example, in the following code, I need to see if a number 'x' could be written as the sum of two numbers in a set. I resort to a sorted array and array's binary search algorithm to get the job done.
Please also comment on my style of the following program. Thanks!
let problem23 =
let factorSum x =
let mutable sum = 0
for i=1 to x/2 do
if x%i=0 then
sum <- sum + i
sum
let isAbundant x = x < (factorSum x)
let abuns = {1..28123} |> Seq.filter isAbundant |> Seq.toArray
let inAbuns x = Array.BinarySearch(abuns, x) >= 0
let sumable x =
abuns |> Seq.exists (fun a -> inAbuns (x-a))
{1..28123} |> Seq.filter (fun x -> not (sumable x)) |> Seq.sum
the updated version:
let problem23b =
let factorSum x =
{1..x/2} |> Seq.filter (fun i->x%i=0) |> Seq.sum
let isAbundant x = x < (factorSum x)
let abuns = Set( {1..28123} |> Seq.filter isAbundant )
let inAbuns x = Set.contains x abuns
let sumable x =
abuns |> Seq.exists (fun a -> inAbuns (x-a))
{1..28123} |> Seq.filter (fun x -> not (sumable x)) |> Seq.sum
This version runs in about 27 seconds, while the first 23 seconds(I've run several times). So an immutable red-black tree actually does not have much speed down compared to a sorted array with binary search. The total number of elements in the set/array is 6965.
Your style looks fine to me. The different steps in the algorithm are clear, which is the most important part of making something work. This is also the tactic I use for solving Project Euler problems. First make it work, and then make it fast.
As already remarked, replacing Array.BinarySearch by Set.contains makes the code even more readable. I find that in almost all PE solutions I've written, I only use arrays for lookups. I've found that using sequences and lists as data structures is more natural within F#. Once you get used to them, that is.
I don't think using mutability inside a function is necessarily bad. I've optimized problem 155 from almost 3 minutes down to 7 seconds with some aggressive mutability optimizations. In general though, I'd save that as an optimization step and start out writing it using folds/filters etc. In the example case of problem 155, I did start out using immutable function composition, because it made testing and most importantly, understanding, my approach easy.
Picking the wrong algorithm is much more detrimental to a solution than using a somewhat slower immutable approach first. A good algorithm is still fast even if it's slower than the mutable version (couch hello captain obvious! cough).
Edit: let's look at your version
Your problem23b() took 31 seconds on my PC.
Optimization 1: use new algorithm.
//useful optimization: if m divides n, (n/m) divides n also
//you now only have to check m up to sqrt(n)
let factorSum2 n =
let rec aux acc m =
match m with
| m when m*m = n -> acc + m
| m when m*m > n -> acc
| m -> aux (acc + (if n%m=0 then m + n/m else 0)) (m+1)
aux 1 2
This is still very much in functional style, but using this updated factorSum in your code, the execution time went from 31 seconds to 8 seconds.
Everything's still in immutable style, but let's see what happens when an array lookup is used instead of a set:
Optimization 2: use an array for lookup:
let absums() =
//create abundant numbers as an array for (very) fast lookup
let abnums = [|1..28128|] |> Array.filter (fun n -> factorSum2 n > n)
//create a second lookup:
//a boolean array where arr.[x] = true means x is a sum of two abundant numbers
let arr = Array.zeroCreate 28124
for x in abnums do
for y in abnums do
if x+y<=28123 then arr.[x+y] <- true
arr
let euler023() =
absums() //the array lookup
|> Seq.mapi (fun i isAbsum -> if isAbsum then 0 else i) //mapi: i is the position in the sequence
|> Seq.sum
//I always write a test once I've solved a problem.
//In this way, I can easily see if changes to the code breaks stuff.
let test() = euler023() = 4179871
Execution time: 0.22 seconds (!).
This is what I like so much about F#, it still allows you to use mutable constructs to tinker under the hood of your algorithm. But I still only do this after I've made something more elegant work first.
You can easily create a Set from a given sequence of values.
let abuns = Set (seq {1..28123} |> Seq.filter isAbundant)
inAbuns would therefore be rewritten to
let inAbuns x = abuns |> Set.mem x
Seq.exists would be changed to Set.exists
But the array implementation is fine too ...
Note that there is no need to use mutable values in factorSum, apart from the fact that it's incorrect since you compute the number of divisors instead of their sum:
let factorSum x = seq { 1..x/2 } |> Seq.filter (fun i -> x % i = 0) |> Seq.sum
Here is a simple functional solution that is shorter than your original and over 100× faster:
let problem23 =
let rec isAbundant i t x =
if i > x/2 then x < t else
if x % i = 0 then isAbundant (i+1) (t+i) x else
isAbundant (i+1) t x
let xs = Array.Parallel.init 28124 (isAbundant 1 0)
let ys = Array.mapi (fun i b -> if b then Some i else None) xs |> Array.choose id
let f x a = x-a < 0 || not xs.[x-a]
Array.init 28124 (fun x -> if Array.forall (f x) ys then x else 0)
|> Seq.sum
The first trick is to record which numbers are abundant in an array indexed by the number itself rather than using a search structure. The second trick is to notice that all the time is spent generating that array and, therefore, to do it in parallel.

Get a random subset from a set in F#

I am trying to think of an elegant way of getting a random subset from a set in F#
Any thoughts on this?
Perhaps this would work: say we have a set of 2x elements and we need to pick a subset of y elements. Then if we could generate an x sized bit random number that contains exactly y 2n powers we effectively have a random mask with y holes in it. We could keep generating new random numbers until we get the first one satisfying this constraint but is there a better way?
If you don't want to convert to an array you could do something like this. This is O(n*m) where m is the size of the set.
open System
let rnd = Random(0);
let set = Array.init 10 (fun i -> i) |> Set.of_array
let randomSubSet n set =
seq {
let i = set |> Set.to_seq |> Seq.nth (rnd.Next(set.Count))
yield i
yield! set |> Set.remove i
}
|> Seq.take n
|> Set.of_seq
let result = set |> randomSubSet 3
for x in result do
printfn "%A" x
Agree with #JohannesRossel. There's an F# shuffle-an-array algorithm here you can modify suitably. Convert the Set into an array, and then loop until you've selected enough random elements for the new subset.
Not having a really good grasp of F# and what might be considered elegant there, you could just do a shuffle on the list of elements and select the first y. A Fisher-Yates shuffle even helps you in this respect as you also only need to shuffle y elements.
rnd must be out of subset function.
let rnd = new Random()
let rec subset xs =
let removeAt n xs = ( Seq.nth (n-1) xs, Seq.append (Seq.take (n-1) xs) (Seq.skip n xs) )
match xs with
| [] -> []
| _ -> let (rem, left) = removeAt (rnd.Next( List.length xs ) + 1) xs
let next = subset (List.of_seq left)
if rnd.Next(2) = 0 then rem :: next else next
Do you mean a random subset of any size?
For the case of a random subset of a specific size, there's a very elegant answer here:
Select N random elements from a List<T> in C#
Here it is in pseudocode:
RandomKSubset(list, k):
n = len(list)
needed = k
result = {}
for i = 0 to n:
if rand() < needed / (n-i)
push(list[i], result)
needed--
return result
Using Seq.fold to construct using lazy evaluation random sub-set:
let rnd = new Random()
let subset2 xs = let insertAt n xs x = Seq.concat [Seq.take n xs; seq [x]; Seq.skip n xs]
let randomInsert xs = insertAt (rnd.Next( (Seq.length xs) + 1 )) xs
xs |> Seq.fold randomInsert Seq.empty |> Seq.take (rnd.Next( Seq.length xs ) + 1)

Resources