Imposing leads and lags into a sequence without a for loop - f#

Aside from writing a loop that yields values, is there a simple, clean functional way of creating a lag (previous value) within a sequence?
E.g. if my sequence is 1 2 3 4 5 6 7 8 9 10 and my lag is 1, return a sequence of tuples that is
(Some(1), None), (Some(2), Some(1)), (Some(3), Some(2))...(Some(10), Some(9))
A lag of 2 would give (Some(1), None), (Some(2), None), (Some(3), Some(1))...
It's obviously easy to write this using a loop, but is that the right way?

One way is to use the functions in the Seq module:
let lag n sequence =
    sequence
    |> Seq.map Some
    |> Seq.append (Seq.init n (fun _ -> None))
    |> Seq.zip sequence
lag 2 [1..5] |> Seq.toList
> [(1, None); (2, None); (3, Some 1); (4, Some 2); (5, Some 3)]

petebu's answer (once a few mistakes are corrected) is a better answer. But I'll leave this here anyway.
let withLag n (source: seq<_>) =
    source
    |> Seq.windowed n
    |> Seq.append (Seq.init n (fun _ -> [||]))
    |> Seq.zip source
    |> Seq.map (fun (x, arr) ->
        let laggedValue = if arr.Length > 0 then Some arr.[0] else None
        (x, laggedValue))
let l = List.init 5 id
l |> withLag 2 |> Seq.toList |> printfn "%A"
> [(0, None); (1, None); (2, Some 0); (3, Some 1); (4, Some 2)]
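The title also asks about leads, which neither answer shows. A minimal sketch of the mirror-image operation follows; the name withLead is made up for this example, and it assumes n >= 0 and a finite source, since it copies the sequence into an array first.

```fsharp
// Hypothetical helper (not from the answers above): pair each element with the
// value n positions ahead of it, or None when there is no such value.
let withLead n (source: seq<'a>) =
    // Materialise once so the source is enumerated only a single time.
    let arr = Seq.toArray source
    arr
    |> Array.mapi (fun i x ->
        let ahead = if i + n < arr.Length then Some arr.[i + n] else None
        (x, ahead))

withLead 2 [1..5] |> Array.toList
// [(1, Some 3); (2, Some 4); (3, Some 5); (4, None); (5, None)]
```

Indexing into an array sidesteps Seq.skip, which raises an exception when the source has fewer than n elements.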

Related

Efficient way to test a symmetric function on all pairings of a Seq

Suppose I have a collection like [ "a"; "b"; "c" ] and I want to test every element against every other element.
I could generate all pairs like this:
let combinations xs =
    Seq.allPairs xs xs
    |> Seq.filter (fun (x, y) -> x <> y)
    |> Seq.toList
combinations [ "a"; "b"; "c" ]
// [("a", "b"); ("a", "c"); ("b", "a"); ("b", "c"); ("c", "a"); ("c", "b")]
But for my test, I always know that f x y = f y x (since f is symmetric), so I want to trim the number of combinations tested:
let combinations xs =
    Seq.allPairs xs xs
    |> Seq.filter (fun (x, y) -> x <> y && x < y)
    |> Seq.toList
combinations [ "a"; "b"; "c" ]
// [("a", "b"); ("a", "c"); ("b", "c")]
But this:
Doesn't seem like an efficient way to generate the test cases
Requires that x : comparison, which I don't think should be necessary
How should I implement this in F#?
Don't know about efficient - this looks like you need to cache the pairs already generated and filter on their presence in the cache.
The library implementation of Seq.allPairs goes along these lines:
let allPairs source1 source2 =
    source1 |> Seq.collect (fun x -> source2 |> Seq.map (fun y -> x, y))
// val allPairs : source1:seq<'a> -> source2:seq<'b> -> seq<'a * 'b>
Then you integrate the caching and filtering into this, constraining both sequences to type seq<'a> and introducing the equality constraint.
let allPairs1 source1 source2 =
    let h = System.Collections.Generic.HashSet()
    source1 |> Seq.collect (fun x ->
        source2 |> Seq.choose (fun y ->
            if x = y || h.Contains (x, y) || h.Contains (y, x) then None
            else
                h.Add (x, y) |> ignore
                Some (x, y) ) )
// val allPairs1 :
// source1:seq<'a> -> source2:seq<'a> -> seq<'a * 'a> when 'a : equality
Test
allPairs1 [1..3] [2..4] |> Seq.toList
// val it : (int * int) list = [(1, 2); (1, 3); (1, 4); (2, 3); (2, 4); (3, 4)]
Because f is commutative, the simplest way to get all combinations is to project each item into a pair with the remainder of the list.
let rec combinations = function
    | [] -> []
    | x::xs -> (xs |> List.map (fun y -> (x, y))) @ (combinations xs)
We don't need any comparison constraint.
let xs = [1; 2; 3; 4;]
combinations xs // [(1, 2); (1, 3); (1, 4); (2, 3); (2, 4); (3, 4)]
Checking the results with @kaefer's method:
combinations xs = (allPairs1 xs xs |> Seq.toList) // true
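If the list is long and you may stop testing early, the same head-with-remainder idea can be written as a lazy sequence. This is only a sketch; the name lazyPairs is invented for the example and is not part of any answer here.

```fsharp
// Sketch of a lazy variant of the recursive approach; pairs are produced on demand.
let rec lazyPairs xs =
    seq {
        match xs with
        | [] -> ()
        | x :: rest ->
            // Pair the head with every later element...
            for y in rest do
                yield (x, y)
            // ...then continue with the tail.
            yield! lazyPairs rest
    }

lazyPairs [ "a"; "b"; "c" ] |> Seq.toList
// [("a", "b"); ("a", "c"); ("b", "c")]
```

Like the list version, it needs neither an equality nor a comparison constraint, because elements are paired by position.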
Another solution that assumes all elements are distinct (it uses position as identity):
let allSymmetricPairs xs =
    seq {
        let xs = Seq.toArray xs
        for i = 0 to Array.length xs - 2 do
            for j = i + 1 to Array.length xs - 1 do
                yield xs.[i], xs.[j]
    }
We can also pre-allocate the array, which may be faster if you plan to pull the whole sequence:
let allSymmetricPairs xs =
    let xs = Seq.toArray xs
    let n = Array.length xs
    let result = Array.zeroCreate (n * (n - 1) / 2)
    let mutable k = 0
    for i = 0 to n - 2 do
        for j = i + 1 to n - 1 do
            result.[k] <- xs.[i], xs.[j]
            k <- k + 1
    result

F# convert array to array of tuples

Let's say I have an array
let arr = [|1;2;3;4;5;6|]
I would like to convert it to something like
[|(1,2);(3,4);(5,6)|]
I've seen Seq.windowed but this one is going to generate something like
[|(1,2);(2,3);(3,4);(4,5);(5,6)|]
which is not what I want
You can use Array.chunkBySize and then map each sub-array into tuples:
let input = [|1..10|]
Array.chunkBySize 2 input |> Array.map (fun xs -> (xs.[0], xs.[1]))
@Slugart's accepted answer is the best approach (IMO), assuming you know that the array has an even number of elements. Here's another approach that doesn't throw an exception if there happens to be an odd number; it just omits the trailing element:
let arr = [|1;2;3;4;5|]
seq { for i in 0 .. 2 .. arr.Length - 2 -> (arr.[i], arr.[i+1]) } |> Seq.toArray
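The chunkBySize approach can be made equally tolerant of odd lengths by pattern matching on each chunk instead of indexing into it. A small sketch, with toPairs as a name chosen only for this example:

```fsharp
// Hypothetical helper: non-overlapping pairs, silently dropping an unpaired last element.
let toPairs (arr: 'a[]) =
    arr
    |> Array.chunkBySize 2
    |> Array.choose (fun chunk ->
        match chunk with
        | [| a; b |] -> Some (a, b)   // a full chunk becomes a tuple
        | _ -> None)                  // a trailing single element is dropped

toPairs [|1; 2; 3; 4; 5|]
// [|(1, 2); (3, 4)|]
```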
You could use Seq.pairwise, as long as you filter out every other tuple. The filtering needs to pass a state through the iteration, which is usually effected by the scan function.
[|1..10|]
|> Seq.pairwise
|> Seq.scan
    (fun s t -> match s with None -> Some t | _ -> None)
    None
|> Seq.choose id
|> Seq.toArray
// val it : (int * int) [] = [|(1, 2); (3, 4); (5, 6); (7, 8); (9, 10)|]
But then it's also possible to have scan generate the tuples directly, at the cost of an intermediate array.
[|1..10|]
|> Array.scan
    (function
     | Some x, _ -> fun y -> None, Some(x, y)
     | _ -> fun x -> Some x, None)
    (None, None)
|> Array.choose snd
Use Seq.pairwise to turn a sequence into tuples
[|1;2;3;4;5;6|]
|> Seq.pairwise
|> Seq.toArray
val it : (int * int) [] = [|(1, 2); (2, 3); (3, 4); (4, 5); (5, 6)|]
That produces overlapping pairs, though. For non-overlapping pairs it should be:
let rec slice =
    function
    | [] -> []
    | a::b::rest -> (a,b) :: slice (rest)
    | _::[] -> failwith "cannot slice uneven list"

Does (Array/List/Seq).groupBy maintain sort order within groups?

Does groupBy guarantee that sort order is preserved in code like the following?
x
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)
By preserving sort order, I mean can we guarantee that within each grouping by x, the result is still sorted by y.
This is true for simple examples,
[(1, 3);(2, 1);(1, 1);(2, 3)]
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)
// seq [(2, seq [(2, 1); (2, 3)]); (1, seq [(1, 1); (1, 3)])]
I want to make sure there are no weird edge cases.
What do you mean by preserving sort order? Seq.groupBy changes the type of the sequence, so how can you even meaningfully compare before and after?
For a given xs of the type seq<'a * 'b>, the type of the expression xs |> Seq.sortBy snd is seq<'a * 'b>, whereas the type of the expression xs |> Seq.sortBy snd |> Seq.groupBy fst is seq<'a * seq<'a * 'b>>. Thus, whether or not the answer to the question is yes or no depends on what you mean by preserving the sort order.
As @Petr wrote in the comments, it's easy to test this. If you're worried about special cases, write a Property using FsCheck and see if it generalises:
open FsCheck.Xunit
open Swensen.Unquote
[<Property>]
let isSortOrderPreserved (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst
    let expected = xs |> Seq.sortBy snd |> Seq.toList
    expected =! (actual |> Seq.map snd |> Seq.concat |> Seq.toList)
In this property, I've interpreted the property of sort order preservation to mean that if you subsequently concatenate the grouped sequences, the sort order is preserved. Your definition may be different.
Given this particular definition, however, running the property clearly demonstrates that the property doesn't hold:
Falsifiable, after 6 tests (13 shrinks) (StdGen (1448745695,296088811)):
Original:
[(-3, -7); (4, -7); (4, 0); (-4, 0); (-4, 7); (3, 7); (3, -1); (-5, -1)]
Shrunk:
[(3, 1); (3, 0); (0, 0)]
---- Swensen.Unquote.AssertionFailedException : Test failed:
[(3, 0); (0, 0); (3, 1)] = [(3, 0); (3, 1); (0, 0)]
false
Here we see that if the input is [(3, 1); (3, 0); (0, 0)], the grouped sequence doesn't preserve the sort order (which isn't surprising to me).
Based on the updated question, here's a property that examines that question:
[<Property(MaxTest = 10000)>]
let isSortOrderPreservedWithEachGroup (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst
    let expected =
        actual
        |> Seq.map (fun (k, vals) -> k, vals |> Seq.sort |> Seq.toList)
        |> Seq.toList
    expected =!
        (actual |> Seq.map (fun (k, vals) -> k, Seq.toList vals) |> Seq.toList)
This property does, indeed, hold:
Ok, passed 10000 tests.
You should still consider carefully whether you want to rely on behaviour that isn't documented, since it could change in later incarnations of F#. Personally, I'd adopt a piece of advice from the Zen of Python:
Explicit is better than implicit.
BTW, the reason for all that conversion to F# lists is that lists have structural equality, while sequences don't.
The documentation doesn't say explicitly (except through the example), but the implementation does preserve the order of the original sequence. It would be quite surprising if it didn't: the equivalent functions in other languages that I am aware of do.
Who cares. Instead of sorting and then grouping, just group and then sort and the ordering is guaranteed even if the F# implementation of groupBy eventually changes:
x
|> Seq.groupBy (fun (x, y) -> x)
|> Seq.map (fun (k, v) -> k, v |> Seq.sortBy (fun (x, y) -> y))
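As a worked example of group-then-sort, here is the question's sample data run through the List versions of the same functions, so the result prints in full. The names data and grouped are just for illustration.

```fsharp
// The sample data from the question.
let data = [(1, 3); (2, 1); (1, 1); (2, 3)]

// Group first, then sort inside each group, so the within-group order
// does not depend on how groupBy treats its input order.
let grouped =
    data
    |> List.groupBy fst
    |> List.map (fun (k, v) -> k, v |> List.sortBy snd)

// grouped = [(1, [(1, 1); (1, 3)]); (2, [(2, 1); (2, 3)])]
```

Each group is sorted by y regardless of input order. The order of the groups themselves (here 1 before 2) still comes from groupBy; sort the outer list by key as well if that matters.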

Problems Creating an Infinite Lazy List

I completed the seventh Euler problem* in F# but am not entirely happy with my implementation. In the function primes I create a sequence that I estimated would contain the 10,001st prime number. When I tried using Seq.initInfinite to lazily generate the candidate primes, my code just hung before throwing an out-of-memory exception.
Could someone advise me on replacing the literal sequence with a lazily-generated sequence which is short-circuited once the desired prime is found?
let isPrime n =
    let bound = int (sqrt (float n))
    seq {2 .. bound} |> Seq.forall (fun x -> n % x <> 0)
let primeAsync n =
    async { return (n, isPrime n)}
let primes =
    {1..1000000}
    |> Seq.map primeAsync
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.filter snd
    |> Array.map fst
    |> Array.mapi (fun i el -> (i, el))
    |> Array.find (fun (fst, snd) -> fst = 10001)
primes
*"By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13. What is the 10,001st prime number?"
I think the problem is/was that Async.RunSynchronously isn't lazy and tried to evaluate the whole infinite sequence. Although there are better solutions for this, your algorithm is fast enough, so you don't even need parallelization; this works perfectly:
open System
let isPrime n =
    let bound = n |> float |> sqrt |> int
    seq {2 .. bound} |> Seq.forall (fun x -> n % x <> 0)
let prime =
    Seq.initInfinite ((+) 2)
    |> Seq.filter isPrime
    |> Seq.skip 10000
    |> Seq.head
The sequence gets 'reified' as soon as you feed it to Async.Parallel. If you want to minimise memory consumption, run the computation serially or split it into lazy chunks, the elements in each chunk to be run in parallel.
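A rough sketch of the lazy-chunks idea: cut the infinite candidate stream into arrays, test each array in parallel, and flatten the survivors back into a lazy sequence. The helper names and the chunk size of 1000 are assumptions made for this example, not something from the original posts, and whether the parallelism pays off for this workload is untested.

```fsharp
let isPrime n =
    let bound = n |> float |> sqrt |> int
    seq { 2 .. bound } |> Seq.forall (fun x -> n % x <> 0)

// Hypothetical helper: a lazy stream of primes, tested one chunk at a time.
let primesInChunks chunkSize =
    Seq.initInfinite ((+) 2)                 // candidates 2, 3, 4, ...
    |> Seq.chunkBySize chunkSize             // cut lazily into arrays
    |> Seq.collect (fun chunk ->
        chunk
        |> Array.Parallel.map (fun n -> n, isPrime n)   // test one chunk in parallel
        |> Array.filter snd
        |> Array.map fst)

// 1-based: nthPrime 1 is 2.
let nthPrime n = primesInChunks 1000 |> Seq.item (n - 1)

nthPrime 6   // 13
```

Only as many chunks are materialised as are needed to reach the requested prime, so memory use stays bounded by the chunk size.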

Pivot or zip a seq<seq<'a>> in F#

Let's say I have a sequence of sequences, e.g.
{1, 2, 3}, {1, 2, 3}, {1, 2, 3}
What is the best way to pivot or zip this sequence so I instead have,
{1, 1, 1}, {2, 2, 2}, {3, 3, 3}
Is there a comprehensible way of doing so without resorting to manipulation of the underlying IEnumerator<_> type?
To clarify, these are seq<seq<int>> objects. Each sequence (both internal and external) can have any number of items.
If you're going for a solution which is semantically Seq, you're going to have to stay lazy all the time.
let zip seq =
    seq
    |> Seq.collect(fun s -> s |> Seq.mapi(fun i e -> (i, e))) //wrap with index
    |> Seq.groupBy(fst) //group by index
    |> Seq.map(fun (i, s) -> s |> Seq.map snd) //unwrap
Test:
open System.Linq
let input = Enumerable.Repeat(seq [1; 2; 3], 3) //don't want to while(true) yield. bleh.
printfn "%A" (zip input)
Output:
seq [seq [1; 1; 1]; seq [2; 2; 2]; seq [3; 3; 3]]
This seems very inelegant but it gets the right answer:
(seq [(1, 2, 3); (1, 2, 3); (1, 2, 3);])
|> Seq.fold (fun (sa,sb,sc) (a,b,c) ->a::sa,b::sb,c::sc) ([],[],[])
|> fun (a,b,c) -> a::b::c::[]
It looks like matrix transposition.
let data =
    seq [
        seq [1; 2; 3]
        seq [1; 2; 3]
        seq [1; 2; 3]
    ]
let rec transpose = function
    | (_::_)::_ as M -> List.map List.head M :: transpose (List.map List.tail M)
    | _ -> []
// I don't claim it is very elegant, but no doubt it is readable
let result =
    data
    |> List.ofSeq
    |> List.map List.ofSeq
    |> transpose
    |> Seq.ofList
    |> Seq.map Seq.ofList
Alternatively, you may adopt the same method for seq, thanks to this answer for an elegant Active pattern:
let (|SeqEmpty|SeqCons|) (xs: 'a seq) =
    if Seq.isEmpty xs then SeqEmpty
    else SeqCons(Seq.head xs, Seq.skip 1 xs)
let rec transposeSeq = function
    | SeqCons(SeqCons(_,_),_) as M ->
        Seq.append
            (Seq.singleton (Seq.map Seq.head M))
            (transposeSeq (Seq.map (Seq.skip 1) M))
    | _ -> Seq.empty
let resultSeq = data |> transposeSeq
See also this answer for technical details and two references: to PowerPack's Microsoft.FSharp.Math.Matrix and yet another method involving mutable data.
This is the same answer as @Asti, just cleaned up a little:
[[1;2;3]; [1;2;3]; [1;2;3]]
|> Seq.collect Seq.indexed
|> Seq.groupBy fst
|> Seq.map (snd >> Seq.map snd);;
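Since the question says the inner sequences can have any number of items, it may help to see what the index-grouping approach does with ragged input. The sketch below wraps it in a function (pivot is a name chosen for this example) and converts to lists so the result prints in full. It relies on Seq.groupBy keeping keys and group members in order of first appearance, which is how the current implementation behaves but is not spelled out in the documentation.

```fsharp
// Hypothetical wrapper around the index-grouping approach above.
let pivot (rows: seq<seq<'a>>) =
    rows
    |> Seq.collect Seq.indexed      // tag every cell with its column index
    |> Seq.groupBy fst              // gather the cells column by column
    |> Seq.map (fun (_, cells) -> cells |> Seq.map snd |> Seq.toList)
    |> Seq.toList

let ragged = seq [ seq [1; 2; 3]; seq [4; 5]; seq [6] ]
pivot ragged
// [[1; 4; 6]; [2; 5]; [3]]
```

Short rows simply contribute nothing to the later columns, whereas the List.head-based transpose above would throw on the same input.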

Resources