F# folding gracefully without the first or last element?

Say you have a list of strings, [ "a"; "b"; "c" ], and you want to transform it into a single string like "a,b,c". Notice that there is no trailing comma.
I find this case comes up again and again, for example when building a code generator for one of the many programming languages that do not allow a trailing comma.
I usually end up with something like:
let listOfThings = ["a";"b";"c"]
let folded =
    listOfThings
    |> List.map (fun i -> i + ",")
    |> List.fold (+) ""
    |> (fun s -> s.Substring(0, s.Length - 1))
I feel like there must already be some fold-like function for this, because it seems such a basic use case, but I can't figure out what its name would be or what to search for.

A fold applies your folding function recursively over all values of the list, starting with an initial state, which you don't particularly want in this case.
It's simpler to use a reduce which uses the list's head as its starting state:
listOfThings |> List.reduce (fun sum cur -> sum + "," + cur) // "a,b,c"
A minor drawback is that since it uses the list head, calling reduce with an empty list would fail. You can mitigate that with a check for an empty list.
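The empty-list check mentioned above could look like the following minimal sketch. The name joinWithComma is made up for illustration, and returning an empty string for an empty list is an assumption about what the caller wants.

```fsharp
// Hypothetical helper: guards List.reduce against the empty list.
let joinWithComma (items: string list) =
    match items with
    | [] -> ""   // assumed result for an empty list
    | _ -> items |> List.reduce (fun acc cur -> acc + "," + cur)

printfn "%s" (joinWithComma ["a"; "b"; "c"])   // a,b,c
```

A single-element list needs no special case, because reduce never calls the function when there is only one element.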
Without any built-ins, as you have described, we skip the addition of the trailing comma for the last element:
let rec join = function
    | [] -> ""
    | [x] -> x
    | x::xs -> x + "," + join xs

["a"; "b"; "c"] |> join // a,b,c
However, the most efficient method would be to use String.Join which internally uses a StringBuilder, while reduce allocates a new string for every call:
String.Join(",", listOfThings) // "a,b,c"

A reduction applied to the elements of a list is necessarily of the same type as these are. In contrast, the accumulator (also called state) of a fold can be of a different type, which is more versatile. The signatures make it apparent:
val reduce: ('a -> 'a -> 'a) -> 'a list -> 'a
val fold: ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
One possible approach is to supply a different folding function for the first element of the list (or for the last, in the case of foldBack). As with reduce, it is prudent to handle the empty list explicitly.
let fold1 folderN folder0 state = function
    | [] -> state
    | x::xs -> List.fold folderN (folder0 state x) xs
// val fold1 :
//   folderN:('a -> 'b -> 'a) ->
//     folder0:('a -> 'b -> 'a) -> state:'a -> _arg1:'b list -> 'a
Now we can fold into a list, or even use a StringBuilder:
([], ["a";"b";"c"])
||> fold1
    (fun s t -> t::", "::s)
    (fun s t -> t::s)
|> List.rev
// val it : string list = ["a"; ", "; "b"; ", "; "c"]

(System.Text.StringBuilder(), ["a";"b";"c"])
||> fold1
    (fun s t -> s.Append(", ").Append t)
    (fun s t -> s.Append t)
|> string
// val it : string = "a, b, c"

Indeed, there is a built in function that does this:
let s = [ "a"; "b"; "c" ]
String.concat ", " s // "a, b, c"
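For the code-generator use case from the question, the edge cases matter most. The sketch below shows how String.concat behaves for empty and single-element lists. The helper renderArgs is hypothetical and only illustrates rendering an argument list.

```fsharp
// Hypothetical code-generator helper: renders a parenthesised argument list.
let renderArgs (args: string list) =
    "(" + String.concat ", " args + ")"

printfn "%s" (renderArgs [])            // ()
printfn "%s" (renderArgs ["x"])         // (x)
printfn "%s" (renderArgs ["x"; "y"])    // (x, y)
```

Unlike List.reduce, String.concat does not fail on an empty list; it returns an empty string.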

Related

In F# when should I use List.choose and when to use List.filter

I have just been doing a CodeWars exercise - "create a function that takes a list of non-negative integers and strings and returns a new list with the strings filtered out".
My solution used List.filter and it failed for one of the edge cases. So I looked at their solution and it used List.choose, which seemed to be pretty much identical to my version except that it converted the result to an option before deciding whether to include it in the new list.
I am confused - please can someone explain when it is best to use 'choose' and when it is best to use 'filter'?
I think you have already observed the essence of the answer: filter allows you to test for a condition, but with choose you can also project your value in the same expression, which would take a separate map if using filter.
Since the problem statement isn't clear (a list cannot contain integer and strings at the same time, except when they are boxed; i.e. the type of the list would be obj list), we can look at both scenarios. Note the additional map functions when using filter.
// List of strings that may contain integer representations
["1"; "abc"; "2"; "def"]
|> List.choose (System.Int32.TryParse >> function
    | true, i -> Some i
    | _ -> None )

["1"; "abc"; "2"; "def"]
|> List.map System.Int32.TryParse
|> List.filter fst
|> List.map snd
Both expressions return int list = [1; 2].
// List of objects that are either of type int or of type string
[box 1; box "abc"; box 2; box "def"]
|> List.choose (function
    | :? int as i -> Some i
    | _ -> None )

[box 1; box "abc"; box 2; box "def"]
|> List.filter (fun i -> i :? int)
|> List.map unbox<int>
In the case of obj list as input the projection serves to provide the correct result type. That might be done in a different way, e.g. with an annotated let binding.
In the end, the decision between the two is down to your personal preferences.
List.choose is strictly more general than List.filter: you can implement List.filter using only List.choose, but not the other way around with List.filter alone. Because List.filter is simpler and describes your intention more accurately, you should use List.choose only when List.filter cannot do the job.
You can observe this difference pretty much from the type signatures alone.
List.choose : ('T -> 'U option) -> 'T list -> 'U list
List.filter : ('T -> bool) -> 'T list -> 'T list
List.filter can be implemented with List.choose like this:
let filter : ('T -> bool) -> 'T list -> 'T list =
    fun predicate ->
        List.choose (fun x -> if predicate x then Some x else None)
List.choose can, however, be implemented (inefficiently) using List.filter along with List.map and Option.get (it is in fact called filterMap in many languages and libraries):
let choose : ('T -> 'U option) -> 'T list -> 'U list =
    fun f list ->
        list
        |> List.map f
        |> List.filter (fun x -> x <> None)
        |> List.map Option.get
Note that Option.get can raise an exception, but it won't here because we've already filtered out the None values that would cause it. Still, because Option.get is unsafe it is easy to make a mistake, and because this implementation is not very efficient, it's nice to have List.choose available out of the box.
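For comparison, here is a sketch of a choose-like function that avoids Option.get altogether by pattern matching inside a single List.foldBack pass. The name chooseSafe is made up for illustration; this is not how FSharp.Core implements List.choose.

```fsharp
// Hypothetical single-pass variant: no Option.get, no intermediate lists.
let chooseSafe (f: 'T -> 'U option) (items: 'T list) : 'U list =
    List.foldBack (fun x acc ->
        match f x with
        | Some y -> y :: acc   // keep the projected value
        | None -> acc          // drop the element
    ) items []

let evenSquares =
    [1; 2; 3; 4]
    |> chooseSafe (fun x -> if x % 2 = 0 then Some (x * x) else None)

printfn "%A" evenSquares   // [4; 16]
```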

SqlProvider recursively composing queries

Using the SqlProvider type provider, I'm trying to do something whereby I recursively fold up a list of 'query criterions',
type Criterion = {
    Column : string
    Operator : string
    Value : string
}
such that the expression tree only gets compiled to SQL once, and I don't hit the database multiple times. I've tried a few approaches, the most successful of which is something like this:
let rec eval (acc : IQueryable<SourceEntity> option) (qrys : Criterion list) =
    match qrys with
    | [] -> acc
    | x :: xs ->
        let acc' =
            let op, valu = translateOpnValu x
            match acc with
            | Some acc' ->
                query {
                    for elem in acc' do
                    where (elem.GetColumn x.Column op valu)
                    select elem
                } |> Some
            | None ->
                query {
                    for elem in ctx.Dbo.Source do
                    where (elem.GetColumn x.Column op valu)
                    select elem
                } |> Some
        eval acc' xs
Where the function translateOpnValu is
let translateOpnValu (c:Criterion) =
    match c.Operator with
    | "%=%" -> (=%), sprintf "%%%s%%" c.Value
    | _ -> (=), c.Value
I am getting this exception:
System.Exception: Unsupported expression. Ensure all server-side objects appear on the left hand side of predicates. The In and Not In operators only support the inline array syntax. InvokeFast(elem.GetColumn("Source Code"), value(FSI_0006+acc'@38-2), "%BEN%")
at Microsoft.FSharp.Linq.RuntimeHelpers.LeafExpressionConverter.EvaluateQuotation(FSharpExpr e)
at Microsoft.FSharp.Linq.QueryModule.EvalNonNestedInner(CanEliminate canElim, FSharpExpr queryProducingSequence)
at Microsoft.FSharp.Linq.QueryModule.EvalNonNestedOuter(CanEliminate canElim, FSharpExpr tm)
at Microsoft.FSharp.Linq.QueryModule.clo@1735-1.Microsoft-FSharp-Linq-ForwardDeclarations-IQueryMethods-Execute[a,b](FSharpExpr`1 )
at FSI_0006.evaluate(FSharpOption`1 acc, FSharpList`1 qrys) in F:\code_root\vs2015\F\CAMS\CAMS\scratch.fsx:line 47
at <StartupCode$FSI_0007>.$FSI_0007.main@() in F:\code_root\vs2015\F\CAMS\CAMS\scratch.fsx:line 60
If I replace the 'op' returned from translateOpnValu with an implicit operator (= / =%), it works fine.
I have a feeling it is to do with the fact the type of the operator returned is getting constrained to (string -> string -> bool), whereas the implicit operators are more generic. How could I get the translateOpnValu function to return more generic operators ? Or perhaps that's not the problem at all ...
@Fyodor is right: for the SQL provider to pick up your function properly, you need to wrap it in a quotation and splice it into the query expression. Something like this should work:
let translateOpnValu (c:Criterion) =
    match c.Operator with
    | "%=%" -> <@ (=%) @>, sprintf "%%%s%%" c.Value
    | _ -> <@ (=) @>, c.Value

// ...
query {
    for elem in acc' do
    where ((%op) (elem.GetColumn x.Column) valu)
    select elem
}
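The quotation-and-splice mechanism can be tried in isolation, without a database. The sketch below only builds a quoted predicate and prints it. The names pickOp and buildPredicate are hypothetical, and (<>) stands in for the provider-specific (=%) operator, which is not available outside SqlProvider.

```fsharp
open Microsoft.FSharp.Quotations

// Hypothetical stand-in for translateOpnValu: picks a quoted operator by name.
let pickOp (name: string) : Expr<string -> string -> bool> =
    match name with
    | "<>" -> <@ (<>) @>
    | _ -> <@ (=) @>

// Splices the quoted operator into a larger quoted predicate with (%op).
let buildPredicate (opName: string) (value: string) : Expr<string -> bool> =
    let op = pickOp opName
    <@ fun (s: string) -> (%op) s value @>

let predicate = buildPredicate "<>" "BEN"
printfn "%A" predicate   // prints the expression tree, not a boolean
```

Because the operator arrives as an expression tree rather than as an opaque function value, a query translator can inspect it, which is what the original closure-based version prevented.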

fold or choose till None?

Is there already a way to do something like a chooseTill or a foldTill, where it will process until a None option is received? Really, any of the higher order functions with a "till" option. Granted, it makes no sense for stuff like map, but I find I need this kind of thing pretty often and I wanted to make sure I wasn't reinventing the wheel.
In general, it'd be pretty easy to write something like this, but I'm curious if there is already a way to do this, or if this exists in some known library?
let chooseTill predicate (sequence:seq<'a>) =
    seq {
        let finished = ref false
        for elem in sequence do
            if not !finished then
                match predicate elem with
                | Some(x) -> yield x
                | None -> finished := true
    }
open System

let foldTill predicate seed list =
    let rec foldTill' acc = function
        | [] -> acc
        | (h::t) ->
            match predicate acc h with
            | Some(x) -> foldTill' x t
            | None -> acc
    foldTill' seed list

// 'string' avoids a method lookup on a value whose type is not yet known
let (++) a b = string a + string b

let abcdef =
    foldTill (fun acc v ->
        if Char.IsWhiteSpace v then None
        else Some(acc ++ v)) "" ("abcdef ghi" |> Seq.toList)
// result is "abcdef"
I think you can get that easily by combining Seq.scan and Seq.takeWhile:
open System
"abcdef ghi"
|> Seq.scan (fun (_, state) c -> c, state + (string c)) ('x', "")
|> Seq.takeWhile (fst >> Char.IsWhiteSpace >> not)
|> Seq.last |> snd
The idea is that Seq.scan is doing something like Seq.fold, but instead of waiting for the final result, it yields the intermediate states as it goes. You can then keep taking the intermediate states until you reach the end. In the above example, the state is the current character and the concatenated string (so that we can check if the character was whitespace).
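The difference between Seq.fold and Seq.scan is easiest to see on a small numeric example. This is only an illustration with made-up values, not part of the original answer.

```fsharp
// Seq.fold returns only the final state.
let total = Seq.fold (+) 0 [1; 2; 3]

// Seq.scan returns the initial state followed by every intermediate state.
let runningTotals = Seq.scan (+) 0 [1; 2; 3] |> Seq.toList

printfn "%d" total            // 6
printfn "%A" runningTotals    // [0; 1; 3; 6]
```

Since the result of scan is a lazy sequence, Seq.takeWhile can cut it short as soon as a state fails the condition.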
A more general version based on a function that returns option could look like this:
let foldWhile f initial input =
    // Generate sequence of all intermediate states
    input
    |> Seq.scan (fun stateOpt inp ->
        // If the current state is not 'None', then calculate a new one
        // if 'f' returns 'None' then the overall result will be 'None'
        stateOpt |> Option.bind (fun state -> f state inp)) (Some initial)
    // Take only 'Some' states and get the last one
    |> Seq.takeWhile Option.isSome
    |> Seq.last |> Option.get
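As a usage sketch, foldWhile (repeated here so the snippet is self-contained) reproduces the "abcdef" example from the question. The same takeWhile idea also gives a choose variant that stops at the first None; the name chooseWhile is made up for illustration.

```fsharp
open System

// Same definition as above, repeated for self-containment.
let foldWhile f initial input =
    input
    |> Seq.scan (fun stateOpt inp ->
        stateOpt |> Option.bind (fun state -> f state inp)) (Some initial)
    |> Seq.takeWhile Option.isSome
    |> Seq.last |> Option.get

let firstWord =
    "abcdef ghi"
    |> foldWhile (fun (acc: string) (c: char) ->
        if Char.IsWhiteSpace c then None else Some (acc + string c)) ""

// Hypothetical counterpart to chooseTill: stops enumerating at the first None.
let chooseWhile chooser source =
    source
    |> Seq.map chooser
    |> Seq.takeWhile Option.isSome
    |> Seq.map Option.get

let smallTimesTen =
    [1; 2; 3; 1]
    |> chooseWhile (fun x -> if x < 3 then Some (x * 10) else None)
    |> Seq.toList

printfn "%s" firstWord        // abcdef
printfn "%A" smallTimesTen    // [10; 20]
```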

Confused with F# List.Fold (powerset function)

I understand and wrote a typical power set function in F# (similar to the Algorithms section in Wikipedia)
Later I found this implementation of powerset, which seems nice and compact, except that I do not understand it.
let rec powerset = function
    | [] -> [[]]
    | h::t -> List.fold (fun xs t -> (h::t)::t::xs) [] (powerset t);
I broke this down to a 1 step non-recursive function to find the powerset of [1;2] and hardcoded the value of power set of 2 at the end [[2]; []]
let right = function
    | [] -> [[]]
    | h::t -> List.fold (fun acc t -> (h::t)::t::acc) [] [[2]; []];
The output is [[1]; []; [1; 2]; [2]] which is correct.
However I was expecting List.Fold to output [[1; 2]; []; [1; 2]; [2]].
Since I was not certain about the 't', I modified the variable names, and I did get what I had expected. Of course this is not the correct powerset of [1;2].
let wrong = function
    | [] -> [[]]
    | h::t -> List.fold (fun acc data -> (h::t)::data::acc) [] [[2]; []];
For me, 't' (the one within fun, not the one in h::t) is simply a name for the second argument to 'fun', but that is obviously not the case. So what is the difference between the "right" and "wrong" F# functions I have written? And what exactly does 't' refer to here?
Thank you ! (I am new to F#)
In your "right" example, t is originally the name of the value bound in the pattern match, but it is hidden by the parameter t in the lambda expression passed to List.fold. Whereas in your "wrong" example, t is captured as a closure in the lambda expression. I think maybe you don't intend this capture, instead you want:
//now it works as you expect, replaced "t" with "data" in your lambda expression.
let wrong = function
    | [] -> [[]]
    | h::t -> List.fold (fun acc data -> (h::data)::data::acc) [] [[2]; []];
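The difference between shadowing and capturing can be reproduced in isolation. In this sketch the values h and t are made up to mirror the names bound by the h::t pattern.

```fsharp
let h = 1
let t = [2]   // plays the role of the tail bound by the h::t pattern

// The lambda parameter is also named t, so it shadows the outer t:
// inside the lambda, t is each element of the input list.
let shadowed = List.map (fun t -> h :: t) [[3]; []]

// The parameter is named data, so t inside the lambda is the outer [2].
let captured = List.map (fun data -> h :: t) [[3]; []]

printfn "%A" shadowed   // [[1; 3]; [1]]
printfn "%A" captured   // [[1; 2]; [1; 2]]
```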
let rec powerset = function
    | [] -> [[]]
    | h::t -> List.fold (fun xs t -> (h::t)::t::xs) [] (powerset t);
Here is a plain-English translation of the code:
If the list (whose power set you want) is empty, return a list that contains a single empty list.
If the list is h::t (head h and the rest t, so h is an element and t is a list), then:
A. (powerset t): calculate the power set of t.
B. (fun xs t -> (h::t)::t::xs) is the function folded over (powerset t). In more detail: xs is the accumulator, initialized to []. xxx::xs adds something to the front of the existing partial power set xs. Here xxx is (h::t)::t, that is, two elements added to the head of xs: t, which stands for each element of (powerset t), and (h::t), which is that element with the head added. The confusing part is the name t: the t in (powerset t) is the rest of the list, while the t inside the lambda is an element of (powerset t).
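To make the two roles of t explicit, the same function can be written with distinct names. This is a renamed sketch of the code being discussed; the names tail, acc and subset are chosen here for illustration.

```fsharp
let rec powerset = function
    | [] -> [[]]
    | h :: tail ->
        // tail   : the rest of the input list
        // subset : one element of (powerset tail)
        // acc    : the power set built so far
        List.fold (fun acc subset -> (h :: subset) :: subset :: acc) [] (powerset tail)

printfn "%A" (powerset [1; 2])   // [[1]; []; [1; 2]; [2]]
```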
Here is an imperative translation of the fold:
let rec powerset list =
    match list with
    | [] -> [[]]
    | h::t ->
        let setForT = powerset t
        let mutable xs = []
        for s in setForT do
            xs <- s :: xs        // s is a valid subset of list
            xs <- (h::s) :: xs   // s with h is also a valid subset of list
        xs
t is a variable bound by pattern matching. List.fold is a fancy way of avoiding explicit looping. Now, go and read some introductory tutorials about F#.

F# Split list into sublists based on comparison of adjacent elements

I've found this question on hubFS, but that handles a splitting criteria based on individual elements. I'd like to split based on a comparison of adjacent elements, so the type would look like this:
val split : ('T -> 'T -> bool) -> 'T list -> 'T list list
Currently, I am trying to start from Don's imperative solution, but I can't work out how to initialize and use a 'prev' value for comparison. Is fold a better way to go?
//Don's solution for single criteria, copied from hubFS
let SequencesStartingWith n (s:seq<_>) =
    seq { use ie = s.GetEnumerator()
          let acc = new ResizeArray<_>()
          while ie.MoveNext() do
              let x = ie.Current
              if x = n && acc.Count > 0 then
                  yield List.ofSeq acc // originally ResizeArray.to_list (F# PowerPack)
                  acc.Clear()
              acc.Add x
          if acc.Count > 0 then
              yield List.ofSeq acc }
This is an interesting problem! I needed to implement exactly this in C# just recently for my article about grouping (because the type signature of the function is pretty similar to groupBy, so it can be used in LINQ query as the group by clause). The C# implementation was quite ugly though.
Anyway, there must be a way to express this function using some simple primitives. It just seems that the F# library doesn't provide any functions that fit for this purpose. I was able to come up with two functions that seem to be generally useful and can be combined together to solve this problem, so here they are:
// Splits a list into two lists using the specified function
// The list is split between two elements for which 'f' returns 'true'
let splitAt f list =
    let rec splitAtAux acc list =
        match list with
        | x::y::ys when f x y -> List.rev (x::acc), y::ys
        | x::xs -> splitAtAux (x::acc) xs
        | [] -> (List.rev acc), []
    splitAtAux [] list

val splitAt : ('a -> 'a -> bool) -> 'a list -> 'a list * 'a list
This is similar to what we want to achieve, but it splits the list only in two pieces (which is a simpler case than splitting the list multiple times). Then we'll need to repeat this operation, which can be done using this function:
// Repeatedly uses 'f' to take several elements of the input list and
// aggregate them into value of type 'b until the remaining list
// (second value returned by 'f') is empty
let foldUntilEmpty f list =
    let rec foldUntilEmptyAux acc list =
        match f list with
        | l, [] -> l::acc |> List.rev
        | l, rest -> foldUntilEmptyAux (l::acc) rest
    foldUntilEmptyAux [] list

val foldUntilEmpty : ('a list -> 'b * 'a list) -> 'a list -> 'b list
Now we can repeatedly apply splitAt (with some predicate specified as the first argument) on the input list using foldUntilEmpty, which gives us the function we wanted:
let splitAtEvery f list = foldUntilEmpty (splitAt f) list
splitAtEvery (<>) [ 1; 1; 1; 2; 2; 3; 3; 3; 3 ];;
val it : int list list = [[1; 1; 1]; [2; 2]; [3; 3; 3; 3]]
I think that the last step is really nice :-). The first two functions are quite straightforward and may be useful for other things, although they are not as general as functions from the F# core library.
How about:
let splitOn test lst =
    List.foldBack (fun el lst ->
            match lst with
            | [] -> [[el]]
            | (x::xs)::ys when not (test el x) -> (el::(x::xs))::ys
            | _ -> [el]::lst
         ) lst []
The foldBack removes the need to reverse the list.
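A quick usage sketch of this foldBack version, with the definition repeated so the snippet runs on its own. The sample inputs are made up for illustration.

```fsharp
let splitOn test lst =
    List.foldBack (fun el lst ->
            match lst with
            | [] -> [[el]]
            | (x::xs)::ys when not (test el x) -> (el::(x::xs))::ys
            | _ -> [el]::lst
         ) lst []

// Split wherever two neighbours differ.
let runs = splitOn (<>) [1; 1; 1; 2; 2; 3]

// Split wherever the gap between neighbours is larger than 1.
let ranges = splitOn (fun a b -> b - a > 1) [1; 2; 4]

printfn "%A" runs     // [[1; 1; 1]; [2; 2]; [3]]
printfn "%A" ranges   // [[1; 2]; [4]]
```

Because foldBack walks the list from the right, each element is prepended to the group of its right-hand neighbour, so the groups come out in the original order.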
Having thought about this a bit further, I've come up with this solution. I'm not sure that it's very readable (except for me who wrote it).
UPDATE Building on the better matching example in Tomas's answer, here's an improved version which removes the 'code smell' (see edits for previous version), and is slightly more readable (says me).
It still breaks on this (splitOn (<>) []), because of the dreaded value restriction error, but I think that might be inevitable.
(EDIT: Corrected bug spotted by Johan Kullbom, now works correctly for [1;1;2;3]. The problem was eating two elements directly in the first match, this meant I missed a comparison/check.)
//Function for splitting list into list of lists based on comparison of adjacent elements
let splitOn test lst =
    let rec loop lst inner outer = //inner=current sublist, outer=list of sublists
        match lst with
        | x::y::ys when test x y -> loop (y::ys) [] (List.rev (x::inner) :: outer)
        | x::xs -> loop xs (x::inner) outer
        | _ -> List.rev ((List.rev inner) :: outer)
    loop lst [] []

splitOn (fun a b -> b - a > 1) [1]
> val it : [[1]]
splitOn (fun a b -> b - a > 1) [1;3]
> val it : [[1]; [3]]
splitOn (fun a b -> b - a > 1) [1;2;3;4;6;7;8;9;11;12;13;14;15;16;18;19;21]
> val it : [[1; 2; 3; 4]; [6; 7; 8; 9]; [11; 12; 13; 14; 15; 16]; [18; 19]; [21]]
Any thoughts on this, or the partial solution in my question?
"adjacent" immediately makes me think of Seq.pairwise.
let splitAt pred xs =
    if Seq.isEmpty xs then
        []
    else
        xs
        |> Seq.pairwise
        |> Seq.fold (fun (curr :: rest as lists) (i, j) ->
            if pred i j then [j] :: lists else (j :: curr) :: rest) [[Seq.head xs]]
        |> List.rev
        |> List.map List.rev
Example:
[1;1;2;3;3;3;2;1;2;2]
|> splitAt (>)
Gives:
[[1; 1; 2; 3; 3; 3]; [2]; [1; 2; 2]]
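For readers who have not used Seq.pairwise: it yields each element together with its successor, which is exactly the pair the split predicate needs. A small illustration with made-up input:

```fsharp
// Each element paired with the one that follows it.
let pairs = [1; 1; 2; 3] |> Seq.pairwise |> Seq.toList

// Where the predicate holds, a new sublist would start at the second item.
let breaks = pairs |> List.map (fun (a, b) -> a <> b)

printfn "%A" pairs    // [(1, 1); (1, 2); (2, 3)]
printfn "%A" breaks   // [false; true; true]
```

A sequence of n elements produces n - 1 pairs, which is why the fold above is seeded with the first element.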
I would prefer using List.fold over explicit recursion.
let splitOn pred = function
    | [] -> []
    | hd :: tl ->
        let (outer, inner, _) =
            List.fold (fun (outer, inner, prev) curr ->
                            if pred prev curr
                            then (List.rev inner) :: outer, [curr], curr
                            else outer, curr :: inner, curr)
                      ([], [hd], hd)
                      tl
        List.rev ((List.rev inner) :: outer)
I like the answers provided by @Joh and @Johan, as these solutions seem to be the most idiomatic and straightforward. I also like the idea suggested by @Shooton. However, each solution has its own drawbacks.
I was trying to avoid:
Reversing lists
Unsplitting and joining back the temporary results
Complex match instructions
Even Seq.pairwise appeared to be redundant
Checking the list for emptiness can be removed at the cost of using Unchecked.defaultof<_> below
Here's my version:
let splitWhen f src =
    if List.isEmpty src then [] else
    src
    |> List.foldBack
        (fun el (prev, current, rest) ->
            if f el prev
            then el , [el] , current :: rest
            else el , el :: current , rest
        )
    <| (List.head src, [], []) // Initial value does not matter, dislike using Unchecked.defaultof<_>
    |> fun (_, current, rest) -> current :: rest // Merge temporary lists
    |> List.filter (not << List.isEmpty) // Drop tail element
