insertAt in F# simpler and/or better - f#

I would like to start a series of questions about simplifying different expressions in F#.
Does anyone have ideas for a better and/or simpler implementation of insertAt? (The parameters could be reordered, too.) Lists or sequences could be used.
Here is a starting implementation:
let insertAt x xs n = Seq.concat [Seq.take n xs; seq [x]; Seq.skip n xs]

The implementation dannyasher posted is a non-tail-recursive one. In order to make the function more efficient, we'll have to introduce an explicit accumulator parameter which makes the function tail-recursive and allows the compiler to optimize the recursion overhead away:
let insertAt n e list =
    let rec insertAtRec acc n e list =
        match n, list with
        | 0, _ -> (List.rev acc) @ [e] @ list
        | _, x::xs -> insertAtRec (x::acc) (n - 1) e xs
        | _ -> failwith "Index out of range"
    insertAtRec [] n e list

Tail-recursive using Seqs:
let rec insertAt = function
    | 0, x, xs -> seq { yield x; yield! xs }
    | n, x, xs -> seq { yield Seq.head xs; yield! insertAt (n-1, x, Seq.skip 1 xs) }

Here's an F# implementation of the Haskell list insertion:
let rec insertAt x ys n =
    match n, ys with
    | 1, _
    | _, [] -> x::ys
    | _, y::ys -> y::insertAt x ys (n-1)
let a = [1 .. 5]
let b = insertAt 0 a 3
let c = insertAt 0 [] 3
>
val a : int list = [1; 2; 3; 4; 5]
val b : int list = [1; 2; 0; 3; 4; 5]
val c : int list = [0]
My Haskell isn't good enough to know whether the case of passing an empty list is correctly taken care of in the Haskell function. In F# we explicitly take care of the empty list in the second match case.
Danny

In case you really want to work with sequences:
let insertAt x ys n =
    let i = ref n
    seq {
        for y in ys do
            decr i
            if !i = 0 then yield x
            yield y
    }
For all other cases dannyasher's answer is definitely nicer and faster.

From the Haskell Wiki - http://www.haskell.org/haskellwiki/99_questions/21_to_28
insertAt :: a -> [a] -> Int -> [a]
insertAt x ys 1 = x:ys
insertAt x (y:ys) n = y:insertAt x ys (n-1)
I'm not an F# programmer, so I don't know the equivalent F# syntax, but this is a nice recursive definition of insertAt.
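For plain lists, the take/skip idea from the question can also be sketched with List.splitAt. This is only an illustration under a few assumptions: insertAtSplit is a made-up name, the index is zero-based (unlike the one-based Haskell version), and List.splitAt raises an exception when n is larger than the length of the list.

```fsharp
// Sketch: zero-based insertion built on List.splitAt.
// insertAtSplit is a hypothetical name, not taken from any of the answers.
let insertAtSplit n x xs =
    // Split into the first n elements and the remainder...
    let before, after = List.splitAt n xs
    // ...and put the new element between the two halves.
    before @ (x :: after)

printfn "%A" (insertAtSplit 2 0 [1 .. 5])   // [1; 2; 0; 3; 4; 5]
```

Like the accumulator version, this copies the first n elements once, so the cost is proportional to n.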

Related

F# get set of subsets containing k elements

Given a set with n elements {1, 2, 3, ..., n}, I want to declare a function which returns the set containing the sets with k number of elements such as:
allSubsets 3 2
Would return [[1;2];[1;3];[2;3]] since those are the sets with 2 elements in a set created by 1 .. n
I've made the initial create-a-set-part but I'm a little stuck on how to find out all the subsets with k elements in it.
let allSubsets n k =
Set.ofList [1..n] |>
UPDATE:
I managed to get a working solution using yield:
let allSubsets n k =
    let setN = Set.ofList [1..n]
    let rec subsets s =
        set [
            if Set.count s = k then yield s
            for e in s do
                yield! subsets (Set.remove e s) ]
    subsets setN
allSubsets 3 2
val it : Set<Set<int>> = set [set [1; 2]; set [1; 3]; set [2; 3]]
But isn't it possible to do it a little cleaner?
What you have is pretty clean, but it's also pretty inefficient. Try running allSubsets 10 3 and you'll know what I mean.
This is what I came up with:
let input = Set.ofList [ 1 .. 15 ]
let subsets (size:int) (input: Set<'a>) =
    let rec inner elems =
        match elems with
        | [] -> [[]]
        | h::t ->
            List.fold (fun acc e ->
                if List.length e < size then
                    (h::e)::e::acc
                else e::acc) [] (inner t)
    inner (Set.toList input)
    |> Seq.choose (fun subset ->
        if List.length subset = size then
            Some <| Set.ofList subset
        else None)
    |> Set.ofSeq
subsets 3 input
The inner recursive function is a modified power set function from here. My first hunch was to generate the power set and then filter it, which would be pretty elegant, but that proved to be rather inefficient as well.
If this was to be production-quality code, I'd look into generating lists of indices of a given length, and use them to index into the input array. This is how FsCheck generates subsets, for example.
You can calculate the powerset and then filter it to keep only the subsets with the specified length:
let powerset n k =
    let lst = Set.toList n
    seq [0..(lst.Length |> pown 2)-1]
    |> Seq.map (fun i ->
        set ([0..lst.Length-1] |> Seq.choose (fun x ->
            if i &&& (pown 2 x) = 0 then None else Some lst.[x])))
    |> Seq.filter (Seq.length >> (=) k)
However, this is not efficient for large sets (n) or when k is close to n. But it's easy to optimize: you can filter candidates out early based on the number of 1 bits in the binary representation of each number.
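A minimal sketch of that early filter, assuming the set has fewer than 31 elements so that every subset fits into an int bit mask. The names popCount and subsetsOfSize are illustrative only. Note that this still walks all 2^n masks; it only avoids building the subsets that would be thrown away.

```fsharp
// Sketch only: popCount and subsetsOfSize are illustrative names.
// Assumes fewer than 31 elements so that each mask fits in an int.

/// Number of 1 bits in a non-negative int (clears the lowest set bit each step).
let rec popCount i =
    if i = 0 then 0 else 1 + popCount (i &&& (i - 1))

let subsetsOfSize k (items: Set<'a>) =
    let arr = Set.toArray items
    seq { 0 .. (1 <<< arr.Length) - 1 }
    // Reject masks with the wrong number of 1 bits before building any set.
    |> Seq.filter (fun mask -> popCount mask = k)
    |> Seq.map (fun mask ->
        arr
        |> Array.indexed
        // Keep the elements whose bit is set in the mask.
        |> Array.filter (fun (x, _) -> (mask &&& (1 <<< x)) <> 0)
        |> Array.map snd
        |> Set.ofArray)

subsetsOfSize 2 (set [1; 2; 3]) |> Seq.iter (printfn "%A")
```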
This function implements the popular n-choose-k function:
let n_choose_k (arr: 'a []) (k: int) : 'a list list =
    let len = Array.length arr
    let rec choose lo x =
        match x with
        | 0 -> [[]]
        | i -> [ for j in lo..(len-1) do
                   for ks in choose (j+1) (i-1) do
                     yield arr.[j]::ks ]
    choose 0 k
> n_choose_k [|1..3|] 2;;
val it : int list list = [[1; 2]; [1; 3]; [2; 3]]
You can use Set.toArray and Set.ofList to convert to and from Set.
You can consider the following approach:
Get the powerset:
let rec powerset xs =
    match xs with
    | [] -> [ [] ]
    | h :: t -> List.fold (fun ys s -> (h :: s) :: s :: ys) [] (powerset t)
Filter all subsets with the necessary number of elements:
let filtered xs k = List.filter (fun (x: 'a list) -> x.Length = k) xs
Finally, get the requested allSubsets:
let allSubsets n k = Set.ofList (List.map (fun xs -> Set.ofList xs) (filtered (powerset [ 1 .. n ]) k))
To check and play with it, you can use:
printfn "%A" (allSubsets 3 2) // set [ set [1; 2]; set [1; 3]; set [2; 3] ]

Finding the Maximum element in a list with pattern matching and recursion F#

I'm trying to find the maximum element in a list without using List.max for a school assignment, using the template given below.
let findMax l =
    let rec helper(l,m) = failwith "Not implemented"
    match l with
    | [] -> failwith "Error -- empty list"
    | (x::xs) -> helper(xs,x)
The only solution to the problem I can think of, atm is
let rec max_value1 l =
    match l with
    |[] -> failwith "Empty List"
    |[x] -> x
    |(x::y::xs) -> if x<y then max_value1 (y::xs)
                   else max_value1 (x::xs)
max_value1 [1; 17; 3; 6; 1; 8; 3; 11; 6; 5; 9];;
Is there any way I can go from the function I built to one that uses the template? Thanks!
Your helper function should do the work, the outer function just validates that the list is not empty and if it's not, calls the helper, which should be something like this:
let rec helper (l,m) =
    match (l, m) with
    | [] , m -> m
    | x::xs, m -> helper (xs, max m x)
Note that since you're matching against the last argument of the function, you can remove it and use function instead of match with:
let rec helper = function
    | [] , m -> m
    | x::xs, m -> helper (xs, max m x)
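Dropped back into the template from the question, the helper could look like the sketch below. It keeps the template's tupled helper(l, m) shape and uses the built-in max function on two values, on the assumption that only List.max is off limits in the assignment.

```fsharp
// Sketch: the question's template with the helper filled in.
let findMax l =
    // m carries the largest element seen so far.
    let rec helper (l, m) =
        match l with
        | [] -> m
        | x :: xs -> helper (xs, max m x)
    match l with
    | [] -> failwith "Error -- empty list"
    | x :: xs -> helper (xs, x)

printfn "%d" (findMax [1; 17; 3; 6; 1; 8; 3; 11; 6; 5; 9])   // 17
```

The recursive call is the last thing helper does, so the function runs as a loop rather than growing the stack.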
let findMax l =
    let rec helper(l,m) =
        match l with
        | [] -> m
        | (x::xs) -> helper(xs, if (Some x > m) then Some x else m)
    helper (l,None)
Example:
[-2;-6;-1;-9;-56;-3] |> findMax
val it : int option = Some -1
An empty list will return None.
You could go for a tuple to pass both, or simply apply the helper function in your main match (instead of the empty list guard clause). I'm including the answer for someone who might find this question in the future and not have a clear answer.
let findMax l =
    let rec walk maxValue = function
        | [] -> maxValue
        | (x::xs) -> walk (if x > maxValue then x else maxValue) xs
    match l with
    | [] -> failwith "Empty list"
    | (head::tail) -> walk head tail
findMax [1; 12; 3; ] //12
Using fold:
let findMax l = l |> List.fold (fun maxValue x -> if x > maxValue then x else maxValue) (List.head l)
I am not sure what the exact rules of your assignment are, but the max of a list is really just List.reduce max. So
let listMax : int list -> int = List.reduce max
You need the type annotation to please the typechecker.
let inline listMax xs = List.reduce max xs
also works and is generic so it works with e.g. floats and strings as well.

Folding a list in F#

I have a pretty trivial task but I can't figure out how to make the solution prettier.
The goal is taking a List and returning results, based on whether they passed a predicate. The results should be grouped. Here's a simplified example:
Predicate: isEven
Inp : [2; 4; 3; 7; 6; 10; 4; 5]
Out: [[^^^^]......[^^^^^^^^]..]
Here's the code I have so far:
let f p ls =
    List.foldBack
        (fun el (xs, ys) -> if p el then (el::xs, ys) else ([], xs::ys))
        ls ([], [])
    |> List.Cons // (1)
    |> List.filter (not << List.isEmpty) // (2)
let even x = x % 2 = 0
let ret =
    [2; 4; 3; 7; 6; 10; 4; 5]
    |> f even
// expected [[2; 4]; [6; 10; 4]]
This code does not seem very readable. Also, I don't like lines (1) and (2). Is there any better solution?
Here is my take. You need a few helper functions first:
// active pattern to choose between even and odd integers
let (|Even|Odd|) x = if (x % 2) = 0 then Even x else Odd x
// fold function to generate a state tuple of current values and accumulated values
let folder (current, result) x =
    match x, current with
    | Even x, _ -> x::current, result // even members are added to current list
    | Odd x, [] -> current, result // odd members are ignored when current is empty
    | Odd x, _ -> [], current::result // odd members start a new current
// test on data
[2; 4; 3; 7; 6; 10; 4; 5]
|> List.rev // reverse list since numbers are added to start of current
|> List.fold folder ([], []) // perform fold over list
|> function | [],x -> x | y,x -> y::x // check that current is List.empty, otherwise add to result
How about this one?
let folder p l = function
    | h::t when p(l) -> (l::h)::t
    | []::_ as a -> a
    | _ as a -> []::a
let f p ls =
    ls
    |> List.rev
    |> List.fold (fun a l -> folder p l a) [[]]
    |> List.filter ((<>) [])
At least the folder is crystal clear and effective, but then you pay the price for this by list reversing.
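The explicit reversal can be avoided by folding from the right with List.foldBack, which lets every element be consed onto the front of its group in the original order. A minimal sketch, with groupRuns as an illustrative name; it is a variation on the code in the question, not a drop-in replacement for the answer above.

```fsharp
// Sketch: group consecutive elements that satisfy p, folding from the right.
// groupRuns is a hypothetical name used only for this illustration.
let groupRuns p ls =
    let folder el (current, finished) =
        if p el then (el :: current, finished)          // extend the open group
        elif List.isEmpty current then ([], finished)   // nothing to close
        else ([], current :: finished)                  // close the open group
    let current, finished = List.foldBack folder ls ([], [])
    // A group that reaches the start of the list is still open after the fold.
    if List.isEmpty current then finished else current :: finished

printfn "%A" (groupRuns (fun x -> x % 2 = 0) [2; 4; 3; 7; 6; 10; 4; 5])
// [[2; 4]; [6; 10; 4]]
```

Because empty groups are never pushed, no filtering step is needed at the end.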
Here is a recursive solution based on a recursive List.filter:
let rec _f p ls =
    match ls with
    |h::t -> if p(h) then
                 match _f p t with
                 |rh::rt -> (h::rh)::rt
                 |[] -> (h::[])::[]
             else []::_f p t
    |[] -> [[]]
let f p ls = _f p ls |> List.filter (fun t -> t <> [])
Having to filter at the end does seem inelegant though.
Here you go. This function should also have fairly good performance.
let groupedFilter (predicate : 'T -> bool) (list : 'T list) =
    (([], []), list)
    ||> List.fold (fun (currentGroup, finishedGroups) el ->
        if predicate el then
            (el :: currentGroup), finishedGroups
        else
            match currentGroup with
            | [] ->
                [], finishedGroups
            | _ ->
                // This is the first non-matching element
                // following a matching element.
                // Finish processing the previous group then
                // add it to the finished groups list.
                [], ((List.rev currentGroup) :: finishedGroups))
    // Need to do a little clean-up after the fold.
    |> fun (currentGroup, finishedGroups) ->
        // If the current group is non-empty, finish it
        // and add it to the list of finished groups.
        let finishedGroups =
            match currentGroup with
            | [] -> finishedGroups
            | _ ->
                (List.rev currentGroup) :: finishedGroups
        // Reverse the finished groups list so the grouped
        // elements will be in their original order.
        List.rev finishedGroups;;
With the list reversing, I would like to go to #seq instead of list.
This version uses mutation (gasp!) internally for efficiency, but may also be a little slower with the overhead of seq. I think it is quite readable though.
let f p (ls) = seq {
    let l = System.Collections.Generic.List<'a>()
    for el in ls do
        if p el then
            l.Add el
        else
            if l.Count > 0 then yield l |> List.ofSeq
            l.Clear()
    if l.Count > 0 then yield l |> List.ofSeq
    }
I can't think of a way to do this elegantly using higher order functions, but here's a solution using a list comprehension. I think it's fairly straightforward to read.
let f p ls =
    let rec loop xs =
        [ match xs with
          | [] -> ()
          | x::xs when p x ->
              let group, rest = collectGroup [x] xs
              yield group
              yield! loop rest
          | _::xs -> yield! loop xs ]
    and collectGroup acc = function
        | x::xs when p x -> collectGroup (x::acc) xs
        | xs -> List.rev acc, xs
    loop ls

More volatile sequence than "classical"

For the Cartesian product there is a good enough function, sequence, which is defined like this:
let rec sequence = function
    | [] -> Seq.singleton []
    | (l::ls) -> seq { for x in l do for xs in sequence ls do yield (x::xs) }
but look at its result:
sequence [[1..2];[1..10000]] |> Seq.skip 1000 ;;
val it : seq<int list> = seq [[1; 1001]; [1; 1002]; [1; 1003]; [1; 1004]; ...]
As we can see, the first "coordinate" of the product changes very slowly: it changes its value only when the second list has been exhausted.
I wrote my own sequence as follows (comments below):
/// Sum of all product indices = n
let rec hyper'plane'indices indexsum maxlengths =
    match maxlengths with
    | [x] -> if indexsum < x then [[indexsum]] else []
    | (i::is) -> [for x in [0 .. min indexsum (i-1)] do for xs in hyper'plane'indices (indexsum-x) is do yield (x::xs)]
    | [] -> [[]]
let finite'sequence = function
    | [] -> Seq.singleton []
    | ns ->
        let ars = [ for n in ns -> Seq.toArray n ]
        let length'list = List.map Array.length ars
        let nmax = List.max length'list
        seq {
            for n in [0 .. nmax] do
                for ixs in hyper'plane'indices n length'list do
                    yield (List.map2 (fun (a:'a[]) i -> a.[i]) ars ixs)
        }
The key idea is to look at (two) lists as (two) orthogonal dimensions, where every element is marked by its index in the list. So we can enumerate all elements by enumerating every element in every section of the Cartesian product by a hyperplane (in the 2D case this is a line). In other words, imagine an Excel sheet where the first column contains values from [1;1] to [1;10000] and the second from [2;1] to [2;10000]. The "hyper plane" with number 1 is then the line that connects cell A2 and cell B1. For our example:
hyper'plane'indices 0 [2;10000];; val it : int list list = [[0; 0]]
hyper'plane'indices 1 [2;10000];; val it : int list list = [[0; 1]; [1; 0]]
hyper'plane'indices 2 [2;10000];; val it : int list list = [[0; 2]; [1; 1]]
hyper'plane'indices 3 [2;10000];; val it : int list list = [[0; 3]; [1; 2]]
hyper'plane'indices 4 [2;10000];; val it : int list list = [[0; 4]; [1; 3]]
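For just two inputs, the same diagonal walk can be sketched without building the index lists at all. This is an illustration of the idea only: diagonalPairs is a made-up name, it handles exactly two arrays, and it yields tuples rather than lists.

```fsharp
// Sketch: enumerate the product of two arrays one anti-diagonal at a time.
// diagonalPairs is a hypothetical name; d plays the role of indexsum.
let diagonalPairs (xs: 'a[]) (ys: 'b[]) =
    seq {
        for d in 0 .. xs.Length + ys.Length - 2 do
            // i indexes xs and d - i indexes ys; both must stay in bounds.
            for i in max 0 (d - ys.Length + 1) .. min d (xs.Length - 1) do
                yield xs.[i], ys.[d - i]
    }

diagonalPairs [| 1; 2 |] [| 10; 20; 30 |] |> Seq.iter (printfn "%A")
```

Each pair is produced exactly once, and both coordinates start changing after the first few elements instead of only after one input is exhausted.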
Well, if we have the indices and the arrays produced from the given lists, then we can define sequence as {all elements in plane 0; then all elements in plane 1; ... and so on} and get a more volatile function than the original sequence.
But finite'sequence turned out to be a very gluttonous function. And now the question: how can I improve it?
With best wishes, Alexander. (and sorry for poor English)
Can you explain what exactly is the problem - time or space complexity or performance? Do you have a specific benchmark in mind? I am not sure how to improve on the time complexity here, but I edited your code a bit to remove the intermediate lists, which might help a bit with memory allocation behavior.
Do not do this:
for n in [0 .. nmax] do
Do this instead:
for n in 0 .. nmax do
Here is the code:
let rec hyper'plane'indices indexsum maxlengths =
    match maxlengths with
    | [] -> Seq.singleton []
    | [x] -> if indexsum < x then Seq.singleton [indexsum] else Seq.empty
    | i :: is ->
        seq {
            for x in 0 .. min indexsum (i - 1) do
                for xs in hyper'plane'indices (indexsum - x) is do
                    yield x :: xs
        }
let finite'sequence xs =
    match xs with
    | [] -> Seq.singleton []
    | ns ->
        let ars = [ for n in ns -> Seq.toArray n ]
        let length'list = List.map Array.length ars
        let nmax = List.max length'list
        seq {
            for n in 0 .. nmax do
                for ixs in hyper'plane'indices n length'list do
                    yield List.map2 Array.get ars ixs
        }
Does this fare any better? Beautiful problem by the way.
UPDATE: Perhaps you are more interested in mixing the sequences fairly than in maintaining the exact formula in your algorithm. Here is some Haskell code that mixes a finite number of possibly infinite sequences fairly, where fairness means that for every input element there is a finite prefix of the output sequence that contains it. You mention in the comment that you have a 2D incremental solution that is hard to generalize to N dimensions, and the Haskell code does exactly that:
merge :: [a] -> [a] -> [a]
merge [] y = y
merge x [] = x
merge (x:xs) (y:ys) = x : y : merge xs ys
prod :: (a -> b -> c) -> [a] -> [b] -> [c]
prod _ [] _ = []
prod _ _ [] = []
prod f (x:xs) (y:ys) = f x y : a `merge` b `merge` prod f xs ys where
  a = [f x y | x <- xs]
  b = [f x y | y <- ys]
prodN :: [[a]] -> [[a]]
prodN [] = [[]]
prodN (x:xs) = prod (:) x (prodN xs)
I have not ported this to F# yet - it requires some thought as sequences do not match to head/tail very well.
UPDATE 2:
A fairly mechanical translation to F# follows.
type Node<'T> =
    | Nil
    | Cons of 'T * Stream<'T>
and Stream<'T> = Lazy<Node<'T>>
let ( !! ) (x: Lazy<'T>) = x.Value
let ( !^ ) x = Lazy.CreateFromValue(x)
let rec merge (xs: Stream<'T>) (ys: Stream<'T>) : Stream<'T> =
    lazy
        match !!xs, !!ys with
        | Nil, r | r, Nil -> r
        | Cons (x, xs), Cons (y, ys) -> Cons (x, !^ (Cons (y, merge xs ys)))
let rec map (f: 'T1 -> 'T2) (xs: Stream<'T1>) : Stream<'T2> =
    lazy
        match !!xs with
        | Nil -> Nil
        | Cons (x, xs) -> Cons (f x, map f xs)
let ( ++ ) = merge
let rec prod f xs ys =
    lazy
        match !!xs, !!ys with
        | Nil, _ | _, Nil -> Nil
        | Cons (x, xs), Cons (y, ys) ->
            let a = map (fun x -> f x y) xs
            let b = map (fun y -> f x y) ys
            Cons (f x y, a ++ b ++ prod f xs ys)
let ofSeq (s: seq<'T>) =
    lazy
        let e = s.GetEnumerator()
        let rec loop () =
            lazy
                if e.MoveNext()
                then Cons (e.Current, loop ())
                else e.Dispose(); Nil
        !! (loop ())
let toSeq stream =
    stream
    |> Seq.unfold (fun stream ->
        match !!stream with
        | Nil -> None
        | Cons (x, xs) -> Some (x, xs))
let empty<'T> : Stream<'T> = !^ Nil
let cons x xs = !^ (Cons (x, xs))
let singleton x = cons x empty
let rec prodN (xs: Stream<Stream<'T>>) : Stream<Stream<'T>> =
    match !!xs with
    | Nil -> singleton empty
    | Cons (x, xs) -> prod cons x (prodN xs)
let test () =
    ofSeq [
        ofSeq [1; 2; 3]
        ofSeq [4; 5; 6]
        ofSeq [7; 8; 9]
    ]
    |> prodN
    |> toSeq
    |> Seq.iter (fun xs ->
        toSeq xs
        |> Seq.map string
        |> String.concat ", "
        |> stdout.WriteLine)

Why is this F# sequence function not tail recursive?

Disclosure: this came up in FsCheck, an F# random testing framework I maintain. I have a solution, but I do not like it. Moreover, I do not understand the problem - it was merely circumvented.
A fairly standard implementation of (monadic, if we're going to use big words) sequence is:
let sequence l =
    let k m m' = gen { let! x = m
                       let! xs = m'
                       return (x::xs) }
    List.foldBack k l (gen { return [] })
Where gen can be replaced by a computation builder of choice. Unfortunately, that implementation consumes stack space, and so it eventually overflows the stack if the list is long enough. The question is: why? I know in principle foldBack is not tail recursive, but the clever bunnies of the F# team have circumvented that in the foldBack implementation. Is there a problem in the computation builder implementation?
If I change the implementation to the below, everything is fine:
let sequence l =
    let rec go gs acc size r0 =
        match gs with
        | [] -> List.rev acc
        | (Gen g)::gs' ->
            let r1,r2 = split r0
            let y = g size r1
            go gs' (y::acc) size r2
    Gen(fun n r -> go l [] n r)
For completeness, the Gen type and computation builder can be found in the FsCheck source
Building on Tomas's answer, let's define two modules:
module Kurt =
    type Gen<'a> = Gen of (int -> 'a)
    let unit x = Gen (fun _ -> x)
    let bind k (Gen m) =
        Gen (fun n ->
            let (Gen m') = k (m n)
            m' n)
    type GenBuilder() =
        member x.Return(v) = unit v
        member x.Bind(v,f) = bind f v
    let gen = GenBuilder()
module Tomas =
    type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
    let unit x = Gen (fun _ f -> f x)
    let bind k (Gen m) =
        Gen (fun n f ->
            m n (fun r ->
                let (Gen m') = k r
                m' n f))
    type GenBuilder() =
        member x.Return v = unit v
        member x.Bind(v,f) = bind f v
    let gen = GenBuilder()
To simplify things a bit, let's rewrite your original sequence function as
let rec sequence = function
    | [] -> gen { return [] }
    | m::ms -> gen {
        let! x = m
        let! xs = sequence ms
        return x::xs }
Now, sequence [for i in 1 .. 100000 -> unit i] will run to completion regardless of whether sequence is defined in terms of Kurt.gen or Tomas.gen. The issue is not that sequence causes a stack overflow when using your definitions, it's that the function returned from the call to sequence causes a stack overflow when it is called.
To see why this is so, let's expand the definition of sequence in terms of the underlying monadic operations:
let rec sequence = function
    | [] -> unit []
    | m::ms ->
        bind (fun x -> bind (fun xs -> unit (x::xs)) (sequence ms)) m
Inlining the Kurt.unit and Kurt.bind values and simplifying like crazy, we get
let rec sequence = function
    | [] -> Kurt.Gen(fun _ -> [])
    | (Kurt.Gen m)::ms ->
        Kurt.Gen(fun n ->
            let (Kurt.Gen ms') = sequence ms
            (m n)::(ms' n))
Now it's hopefully clear why calling let (Kurt.Gen f) = sequence [for i in 1 .. 1000000 -> unit i] in f 0 overflows the stack: f requires a non-tail-recursive call to sequence and evaluation of the resulting function, so there will be one stack frame for each recursive call.
Inlining Tomas.unit and Tomas.bind into the definition of sequence instead, we get the following simplified version:
let rec sequence = function
    | [] -> Tomas.Gen (fun _ f -> f [])
    | (Tomas.Gen m)::ms ->
        Tomas.Gen(fun n f ->
            m n (fun r ->
                let (Tomas.Gen ms') = sequence ms
                ms' n (fun rs -> f (r::rs))))
Reasoning about this variant is tricky. You can empirically verify that it won't blow the stack for some arbitrarily large inputs (as Tomas shows in his answer), and you can step through the evaluation to convince yourself of this fact. However, the stack consumption depends on the Gen instances in the list that's passed in, and it is possible to blow the stack for inputs that aren't themselves tail recursive:
// ok
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> unit i]
f 0 (fun list -> printfn "%i" list.Length)
// not ok...
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> Gen(fun _ f -> f i; printfn "%i" i)]
f 0 (fun list -> printfn "%i" list.Length)
You're correct - the reason why you're getting a stack overflow is that the bind operation of the monad needs to be tail-recursive (because it is used to aggregate values during folding).
The monad used in FsCheck is essentially a state monad (it keeps the current generator and some number). I simplified it a bit and got something like:
type Gen<'a> = Gen of (int -> 'a)
let unit x = Gen (fun n -> x)
let bind k (Gen m) =
    Gen (fun n ->
        let (Gen m') = k (m n)
        m' n)
Here, the bind function is not tail-recursive because it calls k and then does some more work. You can change the monad to be a continuation monad. It is implemented as a function that takes the state and a continuation - a function that is called with the result as an argument. For this monad, you can make bind tail recursive:
type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
let unit x = Gen (fun n f -> f x)
let bind k (Gen m) =
    Gen (fun n f ->
        m n (fun r ->
            let (Gen m') = k r
            m' n f))
The following example will not stack overflow (and it did with the original implementation):
let sequence l =
    let k m m' =
        m |> bind (fun x ->
            m' |> bind (fun xs ->
                unit (x::xs)))
    List.foldBack k l (unit [])
let (Gen f) = sequence [ for i in 1 .. 100000 -> unit i ]
f 0 (fun list -> printfn "%d" list.Length)
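The difference between the two bind functions can be seen in miniature on a plain list sum. The sketch below is not FsCheck code and the names sumDirect and sumCps are illustrative. It assumes tail calls are enabled (the default for optimized builds); without them the chain of continuations in sumCps can still use stack when it finally runs.

```fsharp
// Direct style: the addition happens after the recursive call returns,
// so every element needs its own stack frame.
let rec sumDirect xs =
    match xs with
    | [] -> 0
    | x :: rest -> x + sumDirect rest

// Continuation-passing style: the pending addition is moved into the
// continuation k, so the recursive call to go is in tail position.
let sumCps xs =
    let rec go xs k =
        match xs with
        | [] -> k 0
        | x :: rest -> go rest (fun acc -> k (x + acc))
    go xs id

printfn "%d %d" (sumDirect [1 .. 100]) (sumCps [1 .. 100])   // 5050 5050
```

The pending work has not disappeared in sumCps; it has moved from stack frames into heap-allocated closures, which is the same trade the continuation-based Gen makes.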
