summing elements from a user defined datatype - f#

After covering the predefined datatypes in F# (e.g. lists) and how to sum the elements of a list or a sequence, I'm trying to learn how to work with user-defined datatypes. Say I create a data type, call it list1:
type list1 =
    | A
    | B of int * list1
Where:
A stands for an empty list
B builds a new list by adding an int in front of another list
so 1,2,3,4, will be represented with the list1 value:
B(1, B(2, B(3, B(4, A))))
From the wikibook I learned that with a list I can sum the elements by doing:
List.sum [1; 2; 3; 4]
But how do I go about summing the elements of a user defined datatype? Any hints would be greatly appreciated.
Edit: I'm able to take advantage of the match operator:
let rec sumit (l: list1) : int =
    match l with
    | (B(x1, A)) -> x1
    | (B(x1, B(x2, A))) -> (x1+x2)
sumit (B(3, B(4, A)))
I get:
val it : int = 7
How can I make it so that if I have more than 2 ints it still sums the elements (i.e. B(3, B(4, B(5, A))) gives 12)?

One good general approach to questions like this is to write out your algorithm in word form or pseudocode form, then once you've figured out your algorithm, convert it to F#. In this case where you want to sum the lists, that would look like this:
The first step in figuring out an algorithm is to carefully define the specifications of the problem. I want an algorithm to sum my custom list type. What exactly does that mean? Or, to be more specific, what exactly does that mean for the two different kinds of values (A and B) that my custom list type can have? Well, let's look at them one at a time. If a list is of type A, then that represents an empty list, so I need to decide what the sum of an empty list should be. The most sensible value for the sum of an empty list is 0, so the rule is "If the list is of type A, then the sum is 0". Now, if the list is of type B, then what does the sum of that list mean? Well, the sum of a list of type B would be its int value, plus the sum of the sublist.
So now we have a "sum" rule for each of the two types that list1 can have. If A, the sum is 0. If B, the sum is (value + sum of sublist). And that rule translates almost verbatim into F# code!
let rec sum (lst : list1) =
    match lst with
    | A -> 0
    | B (value, sublist) -> value + sum sublist
A couple of things I want to note about this code. First, one thing you may or may not have seen before (since you seem to be an F# beginner) is the rec keyword. This is required when you're writing a recursive function: without rec, a function's name is not yet in scope inside its own body, so if a function is going to call itself, you have to declare that up front when you declare the function's name and parameters. Second, this is not the best way to write a sum function, because it is not tail-recursive, which means that it might throw a StackOverflowException if you try to sum a really, really long list. At this point in your learning F# you maybe shouldn't worry about that just yet, but eventually you will learn a useful technique for turning a non-tail-recursive function into a tail-recursive one. It involves adding an extra parameter usually called an "accumulator" (often abbreviated to acc), and a properly tail-recursive version of the above sum function would look like this:
let sum (lst : list1) =
    let rec tailRecursiveSum (acc : int) (lst : list1) =
        match lst with
        | A -> acc
        | B (value, sublist) -> tailRecursiveSum (acc + value) sublist
    tailRecursiveSum 0 lst
If you're already at the point where you can understand this, great! If you're not at that point yet, bookmark this answer and come back to it once you've studied tail recursion, because this technique (turning a non-tail-recursive function into a tail-recursive one with the use of an inner function and an accumulator parameter) is a very valuable one that has all sorts of applications in F# programming.
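As a quick check of the two versions above, the sketch below defines both side by side, confirms that they agree on the example from the question, and then runs the accumulator version on a deliberately deep list. The names sumNaive, sumTail and ofList, and the size of one million elements, are assumptions made for illustration only.

```fsharp
type list1 =
    | A
    | B of int * list1

// Direct translation of the two rules: not tail-recursive,
// because the addition still has to happen after the recursive call returns.
let rec sumNaive (lst : list1) : int =
    match lst with
    | A -> 0
    | B (value, sublist) -> value + sumNaive sublist

// Accumulator version: the recursive call is the last thing each step does.
let sumTail (lst : list1) : int =
    let rec loop (acc : int) (rest : list1) =
        match rest with
        | A -> acc
        | B (value, sublist) -> loop (acc + value) sublist
    loop 0 lst

// Hypothetical helper: build a list1 from a built-in list.
// Each item is prepended, so the order ends up reversed (irrelevant for a sum).
let ofList (items : int list) : list1 =
    List.fold (fun acc x -> B (x, acc)) A items

let small = B (3, B (4, B (5, A)))
let deep = ofList (List.replicate 1000000 1)

printfn "%d %d" (sumNaive small) (sumTail small)   // prints "12 12"
printfn "%d" (sumTail deep)                        // prints "1000000"
```

The naive version is intentionally not run on the deep list, since that is exactly the case where it risks overflowing the stack.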

Besides tail-recursion, generic programming may be a concept of importance for the functional learner. Why go to the trouble of creating a custom data type, if it only can hold integer values?
The sum of all elements of a list can be abstracted as the repeated application of the addition operator to all elements of the list and an accumulator primed with an initial state. This can be generalized as a functional fold:
type 'a list1 = A | B of 'a * 'a list1
let fold folder (state : 'State) list =
    let rec loop s = function
        | A -> s
        | B(x : 'T, xs) -> loop (folder s x) xs
    loop state list
// val fold :
//   folder:('State -> 'T -> 'State) -> state:'State -> list:'T list1 -> 'State
B(1, B(2, B(3, B(4, A))))
|> fold (+) 0
// val it : int = 10
Making the sum function generic as well needs a little black magic called statically resolved type parameters. The signature isn't pretty; it essentially tells you that the types involved must support the (+) operator for the code to compile.
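// Unchecked.defaultof<_> serves as the seed: it is 0 for numeric types (and null for reference types).
let inline sum xs = fold (+) Unchecked.defaultof<_> xs
// val inline sum :
//   xs: ^a list1 -> ^b
//     when ( ^b or ^a) : (static member ( + ) : ^b * ^a -> ^b)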
B(1, B(2, B(3, B(4, A))))
|> sum
// val it : int = 10
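To illustrate that the fold above reaches beyond summing, here is a small sketch that reuses it for a few other traversals. The helpers length, product and toList are hypothetical names and not part of the answer; the type and fold are repeated so the snippet stands alone.

```fsharp
type 'a list1 = A | B of 'a * 'a list1

let fold folder (state : 'State) list =
    let rec loop s = function
        | A -> s
        | B (x : 'T, xs) -> loop (folder s x) xs
    loop state list

// Count the elements: the folder ignores each value and bumps a counter.
let length xs = fold (fun n _ -> n + 1) 0 xs

// Multiply the elements: same shape as a sum, with a different operator and seed.
let product xs = fold (*) 1 xs

// Convert to a built-in list. Folding prepends, so reverse at the end to keep the order.
let toList xs = fold (fun acc x -> x :: acc) [] xs |> List.rev

let sample = B (1, B (2, B (3, B (4, A))))
let sampleLength = length sample      // 4
let sampleProduct = product sample    // 24
let sampleList = toList sample        // [1; 2; 3; 4]
```

Only the folder and the initial state change between the three helpers; the traversal itself is written once.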

Related

F#: What to call a combination of map and fold, or of map and reduce?

A simple example, inspired by this question:
module SimpleExample =
    let fooFold projection folder state source =
        source |> List.map projection |> List.fold folder state
    // val fooFold :
    //   projection:('a -> 'b) ->
    //     folder:('c -> 'b -> 'c) -> state:'c -> source:'a list -> 'c
    let fooReduce projection reducer source =
        source |> List.map projection |> List.reduce reducer
    // val fooReduce :
    //   projection:('a -> 'b) -> reducer:('b -> 'b -> 'b) -> source:'a list -> 'b
    let game = [0, 5; 10, 15]
    let minX, maxX = fooReduce fst min game, fooReduce fst max game
    let minY, maxY = fooReduce snd min game, fooReduce snd max game
What would be a natural name for the functions fooFold and fooReduce in this example? Alas, mapFold and mapReduce are already taken.
mapFold is part of the F# library and does a fold operation over the input to return a tuple of 'result list * 'state, similar to scan, but without the initial state and the need to provide the tuple as part of the state yourself. Its signature is:
val mapFold : ('State -> 'T -> 'Result * 'State) -> 'State -> 'T list
-> 'Result list * 'State
Since the projection can easily be integrated into the folder, the fooFold function is only included for illustration purposes.
And MapReduce:
MapReduce is an algorithm for processing huge datasets on certain
kinds of distributable problems using a large number of nodes
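To make the remarks about the library's mapFold and about integrating the projection into the folder concrete, the sketch below shows what List.mapFold returns, next to a single-pass variant of fooFold. The name fooFoldFused is hypothetical and used only for this comparison.

```fsharp
// List.mapFold threads a state through the list and returns both
// the mapped list and the final state.
let doubled, total =
    List.mapFold (fun acc x -> (x * 2, acc + x * 2)) 0 [1 .. 10]
// doubled = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20], total = 110

// fooFold as in the simple example: map first, then fold
// (two passes and one intermediate list).
let fooFold projection folder state source =
    source |> List.map projection |> List.fold folder state

// Hypothetical fused variant: apply the projection inside the folder (one pass).
let fooFoldFused projection folder state source =
    List.fold (fun acc x -> folder acc (projection x)) state source

let game = [0, 5; 10, 15]
let sumX = fooFold fst (+) 0 game            // 0 + 10 = 10
let sumXFused = fooFoldFused fst (+) 0 game  // same result, no intermediate list
```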
Now for a more complex example, where the fold/reduce is not directly applied to the input, but to the groupings following a selection of the keys.
The example has been borrowed from a Python library, where it is called - perhaps misleadingly - reduceby.
module ComplexExample =
    let fooFold keySelection folder state source =
        source |> Seq.groupBy keySelection
        |> Seq.map (fun (k, xs) ->
            k, Seq.fold folder state xs)
    // val fooFold :
    //   keySelection:('a -> 'b) ->
    //     folder:('c -> 'a -> 'c) -> state:'c -> source:seq<'a> -> seq<'b * 'c>
    //     when 'b : equality
    let fooReduce keySelection projection reducer source =
        source |> Seq.groupBy keySelection
        |> Seq.map (fun (k, xs) ->
            k, xs |> Seq.map projection |> Seq.reduce reducer)
    // val fooReduce :
    //   keySelection:('a -> 'b) ->
    //     projection:('a -> 'c) ->
    //       reducer:('c -> 'c -> 'c) -> source:seq<'a> -> seq<'b * 'c>
    //     when 'b : equality
    type Project = { name : string; state : string; cost : decimal }
    let projects =
        [ { name = "build roads"; state = "CA"; cost = 1000000M }
          { name = "fight crime"; state = "IL"; cost = 100000M }
          { name = "help farmers"; state = "IL"; cost = 2000000M }
          { name = "help farmers"; state = "CA"; cost = 200000M } ]
    fooFold (fun x -> x.state) (fun acc x -> acc + x.cost) 0M projects
    // val it : seq<string * decimal> = seq [("CA", 1200000M); ("IL", 2100000M)]
    fooReduce (fun x -> x.state) (fun x -> x.cost) (+) projects
    // val it : seq<string * decimal> = seq [("CA", 1200000M); ("IL", 2100000M)]
What would be the natural name for the functions fooFold and fooReduce here?
I'd probably call the first two mapAndFold and mapAndReduce (though I agree that mapFold and mapReduce would be good names if they were not already taken). Alternatively, I'd go with mapThenFold (etc.), which is perhaps more explicit, but it reads a bit cumbersome.
For the more complex ones, reduceBy and foldBy sound good. The issue is that this would not work if you also wanted a version of those functions that do not do the mapping operation. If you wanted that, you'd probably need mapAndFoldBy and mapAndReduceBy (as well as just foldBy and reduceBy). This gets a bit ugly, but I'm afraid that's the best you can do.
More generally, the issue when comparing names with Python is that a single Python function can cover several call shapes through optional and keyword arguments, whereas let-bound F# functions cannot be overloaded. This means that every variant needs its own unique name, so you just need to come up with a consistent naming scheme that will not make the names unbearably long.
(I experienced this when coming up with names for the functions in the Deedle library, which is somewhat inspired by Pandas. See, for example, the aggregation functions in Deedle: there is a pattern in the naming to deal with the fact that each function needs a unique name.)
I have a different opinion than Thomas.
First, I think that not having overloads is a good thing, and giving every operation a unique name is also a good thing. I would also say that giving long names to rarely used functions is even more important and should not be avoided.
Writing longer names is rarely a problem, as we programmers usually use an IDE with auto-completion. But reading and understanding is different. Knowing what a function does because of a long descriptive name is better than a short name.
A long descriptive function name gets more important the less often a function is used. It helps with reading and understanding the code. A short and less descriptive function name that is rarely used causes confusion. The confusion would only increase if it were just an overload of another function name.
Yes, naming things can be hard; that's the reason why it's important and shouldn't be avoided.
As for what you describe: I would have named them mapFold and mapReduce, as those describe exactly what they do.
There is already a mapFold in F#, and in my opinion, the F# devs messed up either the naming, the arguments or the output of the function. But anyhow, they just messed up.
I would have expected mapFold to do map and then fold. Actually it does, but it also returns the intermediate list that is created along the way, something I would not expect it to return. And I would also expect to pass it two functions instead of one.
When we get to Thomas's suggestion of naming it mapAndFold or mapThenFold, I would expect different behaviour from those two functions. mapThenFold tells exactly what it does: map, and then fold over the result. I think the "then" is not important. That's also why I would name it mapFold or mapReduce. Writing it this way already suggests a "then". But mapAndFold or mapAndReduce does not say anything about the order of execution. It just says that it does two things, or somehow returns this AND that.
With that in mind, I would say that the F# library should either have named its mapFold mapAndFold, or changed the return value to just return the result of the fold (and take two function arguments instead of one). But hey, it's messed up now, and we cannot change it anymore.
As for mapReduce, I think you are a little bit mistaken. The mapReduce algorithm is named that way because it just does map and then reduce. And that's it.
But functional programming, with its stateless and more descriptive operations, sometimes has additional benefits. Technically a map is less powerful than a for/fold, as it only describes how values are changed, without the order or the position in the list mattering. But because of this limitation, you can run it in parallel, even on a big computer cluster. And that's all the MapReduce algorithm you cite does.
But that doesn't mean a mapReduce must always run its operation on a big cluster or in parallel. In my opinion you could just name it mapReduce and that's fine. Everybody will know what it does, and I think nobody expects it to suddenly run on a cluster.
In general I think the mapFold that F# provides is silly. Here are 4 examples of how I think these functions should have been provided.
let double x = x * 2
let add x y = x + y
mapFold double add 0 [1..10] // 110
mapAndFold double add 0 [1..10] // [2;4;6;8;10;12;14;16;18;20] * 110
mapReduce double add [1..10] // Some (110)
mapAndReduce double add [1..10] // Some ([2;4;6;8;10;12;14;16;18;20] * 110)
Well, mapFold doesn't work that way, so you have the following options:
1. Implement mapReduce the way you have it, and ignore the inconsistency with mapFold.
2. Provide mapAndReduce and mapReduce.
3. Make your mapReduce return the same crap as the default implementation of mapFold does and provide mapThenReduce.
4. Like (3), but also add mapThenFold.
Option 4 has the most compatibility and expectation of what already exists in F#. But that doesn't mean you must do it that way.
In my opinion I would just:
implement mapReduce returning the result of map and then reduce.
I wouldn't care about a mapAndReduce version that returns a list and the result.
Provide a mapThenFold expecting two function arguments returning the result just of fold.
As a general note: implementing mapReduce just by calling map and then reduce is somewhat pointless. I would expect it to have a more low-level implementation that does both things while traversing the data structure only once. If not, I can just call map and then reduce anyway.
So an implementation should look like:
let mapReduce mapper reducer xs =
    let rec loop state xs =
        match xs with
        | [] -> state
        | x::xs -> loop (reducer state (mapper x)) xs
    match xs with
    | [] -> ValueNone
    | [x] -> ValueSome (mapper x)
    | x::xs -> ValueSome (loop (mapper x) xs)
let double x = x * 2
let add x y = x + y
let some110 = mapReduce double add [1..10]
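A short usage sketch for the mapReduce above, exercising the three branches of its final match. The definition is repeated so the snippet stands alone; the binding names many, one and none are made up for this illustration.

```fsharp
let mapReduce mapper reducer xs =
    let rec loop state xs =
        match xs with
        | [] -> state
        | x :: xs -> loop (reducer state (mapper x)) xs
    match xs with
    | [] -> ValueNone
    | [x] -> ValueSome (mapper x)
    | x :: xs -> ValueSome (loop (mapper x) xs)

let double x = x * 2
let add x y = x + y

let many = mapReduce double add [1 .. 10]   // ValueSome 110
let one = mapReduce double add [21]         // ValueSome 42: mapped, nothing to reduce
let none = mapReduce double add []          // ValueNone: no first element to start from
```

Returning a value option rather than throwing on an empty input is the design choice made in the answer's code; List.reduce in the core library throws instead.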

Noob question about F# function parameter where the parameter is a list

I'm trying to play around with creating functions in F#. I'm trying to create a function that takes a list of floats and sums the values in the list. I don't know how to pass a list as a parameter to a function, so I tried this to get the head of a list, but the code doesn't work:
let sumlist l=
    printf "%f" l.Head
Then I see that some people do:
let sumlist l:float=
    match l with
    | [] -> 0.0
    | e::li -> e + sumlist li
So is l:float the way you pass a list to a function? So would l:string be a list of strings?
But I saw that a list l has l.Head to return the first element in the list (as it seems that we can't access arbitrary elements in the list like in an array), but
let sumlist l:float=
    printfn "%f" l.Head
gives a type mismatch error.
I also don't understand the recursive code provided. I don't understand this line:
| e::li -> e + sumlist li
What is ::? And li?
Thank you for clarifying this for me!
Your first example doesn't return anything because it calls printf, which prints to the console and returns unit rather than a float. e :: li here is a pattern for a list where e is the head and li is the rest of the list. The :: lets the compiler know that you want to deconstruct the list.
//fully annotated
let s (l: float list) :float =
    l.Head
//here the types can be inferred without any annotation
let rec sumlist l =
    match l with
    | [] -> 0.0
    | e::li -> e + sumlist li
s [0.7]
//returns 0.7
sumlist [0.4;0.5;0.6]
//returns 1.5
In my first example, if you remove the type annotations you'll notice that you get an error. This is because the type of l.Head is otherwise ambiguous: did you call l.Head on a list of strings, or of floats? In the sumlist function I provided, I didn't need to annotate, because adding the elements to 0.0 constrains the types.
Personally, when starting out, I highly recommend always annotating the types. (l : float list) or (l : list<float>) is a way to say that the input is a list of floats, and : float at the end is how we say that the return type is a float. You'll notice I put the rec keyword on our recursive function; F# requires it for a function that calls itself.
Syntax questions
So is l:float the way you pass a list to a function?
No. In let sumlist l:float, the annotation applies to the function's return type, not to l. Most of the time the compiler can figure out that you are passing a list without the parameter being annotated, but when it can't, you annotate it as
l : 'a list // where 'a is generic type
// OR
l : float list // where type is specified as float
What is ::? and Li?
When pattern matching a list, [] matches to empty list, which here is used as the recursion end criteria. The other match separates head (e) from the rest of the list aka tail (li). If there is only one item in list, then li evaluates as [].
Additional note on your recursive code: you are missing the recursion keyword rec, e.g.
let rec sumlist ...
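To see the two list patterns described above in isolation, here is a minimal sketch. The function name headAndTail is hypothetical; it simply hands back whatever the e :: li pattern bound.

```fsharp
// Split a list into its head and tail, if it has a head at all.
let headAndTail (l : float list) : (float * float list) option =
    match l with
    | [] -> None                 // empty list: there is nothing to split
    | e :: li -> Some (e, li)    // e is the first element, li is the rest

let three = headAndTail [0.4; 0.5; 0.6]   // Some (0.4, [0.5; 0.6])
let single = headAndTail [0.7]            // Some (0.7, []): the tail is the empty list
let empty = headAndTail []                // None
```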
Recursive function implementation
The easiest way would be to use the sum function of List, e.g.
[0.4; 0.5; 0.6] |> List.sum // Returns 1.5
But, if you want to create this function yourself, consider using tail-recursion for better performance and to avoid stack overflow with bigger input lists.
let sumlist (values : float list) =
    let rec sum (acc : float) (remaining : float list) =
        match remaining with
        | [] -> acc
        | head :: tail -> sum (acc + head) tail
    sum 0. values
Which is called
[0.4; 0.5; 0.6] |> sumlist // Returns 1.5
The difference from ordinary recursion is that the recursive call is the last thing each step does: the running total is passed along in acc, so no step has to wait for later calls to return before it can finish its own calculation.
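The evaluation trace below is a sketch of how the accumulator moves through the calls, followed by a run on a long list. The input values and the list size of one million are arbitrary choices for illustration.

```fsharp
let sumlist (values : float list) =
    let rec sum (acc : float) (remaining : float list) =
        match remaining with
        | [] -> acc
        | head :: tail -> sum (acc + head) tail
    sum 0. values

// Evaluation trace for sumlist [1.0; 2.0; 3.0]:
//   sum 0.0 [1.0; 2.0; 3.0]
//   sum 1.0 [2.0; 3.0]
//   sum 3.0 [3.0]
//   sum 6.0 []
//   6.0
// Each line is a complete call with nothing left pending,
// so the length of the list does not affect the stack depth.
let traced = sumlist [1.0; 2.0; 3.0]
let big = List.replicate 1000000 1.0 |> sumlist
```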

How to properly create and use polynomial type and term type in f#

I'm trying to do this exercise:
I'm not sure how to use types in F#. In F# Interactive, I wrote type term = Term of float * int, then tried to create a value of type term with let x: term = (3.5,8);; but it gives an error.
Then I tried let x: term = Term (3.5,8);; and it worked. Why is that?
For the first function, I tried:
let multiplyPolyByTerm (x:term, p:poly)=
    match p with
    |[]->[]
But that gives an error on the line |[]->[] saying that the expression was expected to have type poly. But poly is in fact a list, right? So why is it wrong here? I fixed it with |Poly[]->Poly[]. Then I tried to finish the function by giving the recursive definition of multiplying each term of the polynomial by the given term: |Poly a::af-> This gives an error, so I'm stuck on trying to break down the Poly list.
If anyone has suggestions for good reading about types in F#, please share them.
I have all the methods now. However, I find myself unable to throw an exception when the polynomial is an empty list, as the base case of my recursive function is an empty list. Also, I don't know how to group common terms together. Please help. Here is my code:
type poly=Poly of (float*int) list
type term = Term of float *int
exception EmptyList
(*
let rec mergeCommonTerm(p:poly)=
    let rec iterator ((a: float,b: int ), k: (float*int) list)=
        match k with
        |[]->(a,b)
        |ki::kf-> if b= snd ki then (a+ fst ki,b)
    match p with
    |Poly [] -> Poly []
    |Poly (a::af)-> match af with
                    |[]-> Poly [a]
                    |b::bf -> if snd a =snd b then Poly (fst a +fst b,snd a)::bf
                              else
*)
let rec multiplyPolyByTerm (x:term, p:poly) =
    match x with
    | Term (coe,deg) ->
        match p with
        | Poly [] -> Poly []
        | Poly (a::af) ->
            match multiplyPolyByTerm (x, Poly af) with
            | Poly recusivep -> Poly ((fst a * coe, snd a + deg)::recusivep)
let rec addTermToPoly (x:term, p:poly) =
    match x with
    | Term (coe, deg) ->
        match p with
        | Poly [] -> Poly [(coe,deg)]
        | Poly (a::af) ->
            if snd a = deg then Poly ((fst a + coe, deg)::af)
            else
                match addTermToPoly (x, Poly af) with
                | Poly recusivep -> Poly (a::recusivep)
let rec addPolys (x:poly, y:poly) =
    match x with
    | Poly [] -> y
    | Poly (xh::xt) -> addPolys (Poly xt, addTermToPoly (Term xh, y))
let rec multPolys (x:poly, y:poly) =
    match x with
    | Poly [] -> Poly []
    | Poly (xh::xt) -> addPolys (multiplyPolyByTerm (Term xh, y), multPolys (Poly xt, y))
let evalTerm (values:float) (termmm : term) : float =
    match termmm with
    | Term (coe,deg) -> coe * (values ** float(deg))
let rec evalPoly (polyn : poly, v: float) : float =
    match polyn with
    | Poly [] -> 0.0
    | Poly (ph::pt) -> (evalTerm v (Term ph)) + evalPoly (Poly pt, v)
let rec diffPoly (p:poly) : poly =
    match p with
    | Poly [] -> Poly []
    | Poly (ah::at) ->
        match diffPoly (Poly at) with
        | Poly [] ->
            if snd ah = 0 then Poly []
            else Poly [(float(snd ah) * fst ah, snd ah - 1)]
        | Poly (bh::bt) -> Poly ((float(snd ah) * fst ah, snd ah - 1)::bh::bt)
As I mentioned in a comment, reading https://fsharpforfunandprofit.com/posts/discriminated-unions/ will be very helpful for you. But let me give you some quick help to get you unstuck and starting to solve your immediate problems. You're on the right track, you're just struggling a little with the syntax (and operator precedence, which is part of the syntax).
First, load the MSDN operator precedence documentation in another tab while you read the rest of this answer. You'll want to look at it later on, but first I'll explain a subtlety of how F# treats discriminated unions that you probably haven't understood yet.
When you define a discriminated union type like poly, the name Poly acts like a constructor for the type. In F#, constructors are functions. So when you write Poly (something), the F# parser interprets this as "take the value (something) and pass it to the function named Poly". Here, the function Poly isn't one you had to define explicitly; it was implicitly defined as part of your type definition. To really make this clear, consider this example:
type Example =
    | Number of int
    | Text of string
5 // This has type int
Number 5 // This has type Example
Number // This has type (int -> Example), i.e. a function
"foo" // This has type string
Text "foo" // This has type Example
Text // This has type (string -> Example), i.e. a function
Now look at the operator precedence list that you loaded in another tab. Lowest precedence is at the top of the table, and highest precedence is at the bottom; in other words, the lower something is on the table, the more "tightly" it binds. As you can see, function application (f x, calling f with parameter x) binds very tightly, more tightly than the :: operator. So when you write f a::b, that is not read as f (a::b), but rather as (f a)::b. In other words, f a::b reads as "Item b is a list of some type which we'll call T, and the function call f a produces an item of type T that should go in front of list b". If you instead meant "take the list formed by putting item a at the head of list b, and then call f with the resulting list", then that needs parentheses: you have to write f (a::b) to get that meaning.
So when you write Poly a::af, that's interpreted as (Poly a)::af, which means "Here is a list. The first item is a Poly a, which means that a is a (float * int) list. The rest of the list will be called af". And since the value you're passing into it is not a list, but rather a value of type poly, that is a type mismatch. (Note that items of type poly contain lists, but they are not themselves lists.) What you needed to write was Poly (a::af), which would have meant "Here is an item of type poly that contains a list. That list should be split into the head, a, and the rest, af."
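The precedence rule can be checked with a small sketch on both sides: building a value and matching a pattern. The names items and headTerm are made up for this illustration, and the example deliberately stays away from the assignment's own functions.

```fsharp
type poly = Poly of (float * int) list

type Example =
    | Number of int
    | Text of string

// Function application binds tighter than ::,
// so this parses as (Number 5) :: [Text "foo"], an Example list.
let items : Example list = Number 5 :: [Text "foo"]

// The same rule applies in patterns, so the list pattern needs parentheses.
let headTerm (p : poly) : (float * int) option =
    match p with
    | Poly [] -> None
    | Poly (a :: af) -> Some a   // a is the first term, af the remaining terms

let first = headTerm (Poly [(3.5, 8); (1.0, 0)])   // Some (3.5, 8)
```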
I hope that helped rather than muddying the waters further. If you didn't understand any part of this, let me know and I'll try to make it clearer.
P.S. Another point of syntax you might want to know: F# gives you many ways to signal an error condition (like an empty list in this assignment), but your professor has asked you to use exception EmptyList when invalid input is given. That means he expects your code to "throw" or "raise" an exception when you encounter an error. In C# the term is "throw", but in F# the term is "raise", and the syntax looks like this:
if someErrorCondition then
    raise EmptyList
// Or ...
match listThatShouldNotBeEmpty with
| [] -> raise EmptyList
| head::rest -> // Do something with head, etc.
That should take care of the next question you would have needed to ask. :-)
Update 2: You've edited your question to clarify another issue you're having, where your recursive function boils down to an empty list as the base case — yet your professor asked you to consider an empty list as an invalid input. There are two ways to solve this. I'll discuss the more complicated one first, then I'll discuss the easier one.
The more complicated way to solve this is to have two separate functions, an "outer" one and an "inner" one, for each of the functions you have been asked to define. In each case, the "outer" one checks whether the input is an empty list and throws an exception if that's the case. If the input is not an empty list, then it passes the input to the "inner" function, which does the recursive algorithm (and does NOT consider an empty list to be an error). So the "outer" function is basically only doing error-checking, and the "inner" function is doing all the work. This is a VERY common approach in professional programming, where all your error-checking is done at the "edges" of your code, while the "inner" code never has to deal with errors. It's therefore a good approach to know about — but in your particular case, I think it's more complicated than you need.
The easier solution is to rewrite your functions to consider a single-item list as the base case, so that your recursive functions never go all the way to an empty list. Then you can always consider an empty list to be an error. Since this is homework I won't give you an example based on your actual code, but rather an example based on a simple "take the sum of a list of integers" exercise where an empty list would be considered an error:
let rec sumNonEmptyList (input : int list) : int =
    match input with
    | [] -> raise EmptyList
    | [x] -> x
    | x::rest -> x + sumNonEmptyList rest
The syntax [x] in a match expression means "This matches a list with exactly one item in it, and assigns the name x to the value of that item". In your case, you'd probably be matching against Poly [] to raise an exception, Poly [a] as the base case, and Poly (a::af) as the "more than one item" case. (That's as much of a clue as I think I should give you; you'll learn better if you work out the rest yourself).
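For completeness, the "outer" and "inner" approach described earlier could look roughly like this on the same sum-of-integers exercise. This is only a sketch of the pattern, with the hypothetical names sumChecked and inner, and it is not based on the assignment's code.

```fsharp
exception EmptyList

// Outer function: does the error checking at the edge.
let sumChecked (input : int list) : int =
    // Inner function: does the recursive work and treats [] as a normal base case.
    let rec inner (acc : int) (rest : int list) =
        match rest with
        | [] -> acc
        | x :: tail -> inner (acc + x) tail
    match input with
    | [] -> raise EmptyList
    | _ -> inner 0 input

let total = sumChecked [1; 2; 3]   // 6

// An empty input raises EmptyList instead of returning 0.
let rejectedEmpty =
    try
        sumChecked [] |> ignore
        false
    with EmptyList -> true
```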

F# type mismatch while calling function

This code
let rec readNLines n list =
    if n = 0 then
        list
    else
        readNLines(n-1,readInt()::list)
ends with
Type mismatch. Expecting a 'a but given a 'a -> 'a
The resulting type would be infinite when unifying ''a' and
''a -> 'a' (using built-in F# compiler)
but runs ok when last line is changed to
readNLines(n-1,(readInt()::list))
or
readNLines(n-1)(readInt()::list)
Question is: Why? :|
Only the last version can work, because readNLines takes two arguments, but
readNLines (n - 1, readInt() :: list)
passes only one argument (which is a tuple consisting of an int and the list).
readNLines (n - 1) (readInt() :: list)
passes them as two separate arguments - the difference here is using the comma (tuple) and space (two arguments).
By the way, that becomes much clearer when you use more whitespace (as I did), because the individual elements are easier to identify.
Take a look at these two functions:
> let f1 a b = a + b
val f1 : a:int -> b:int -> int
> let f2 (a, b) = a + b
val f2 : a:int * b:int -> int
As you can see, they have slightly different types. Function f1 takes its arguments one at a time, so it can be partially applied (you'll see the term 'curried function' used here); function f2 takes a tuple of arguments in one "go", so you can think of it as only ever having a single argument (an 'uncurried' function).
What you're doing is defining a function f1 style, but later calling it f2 style, which confuses the compiler.
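The difference can be tried out with the sketch below. Since the question does not show readInt, a stand-in that returns 1, 2, 3, ... on successive calls is assumed here; the names addTen, counter, fromCurried, fromTupled and lines are likewise made up for the example.

```fsharp
let f1 a b = a + b        // curried: int -> int -> int
let f2 (a, b) = a + b     // tupled:  int * int -> int

let addTen = f1 10        // partial application is possible only with the curried form
let fromCurried = addTen 5        // 15
let fromTupled = f2 (10, 5)       // 15

// Stand-in for the question's readInt (an assumption, not the original function).
let counter = ref 0
let readInt () =
    counter.Value <- counter.Value + 1
    counter.Value

let rec readNLines n list =
    if n = 0 then
        list
    else
        readNLines (n - 1) (readInt () :: list)   // two space-separated arguments

let lines = readNLines 3 []       // [3; 2; 1]: each value is prepended
```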

cons operator (::) in F#

The :: operator in F# always prepends elements to the list. Is there an operator that appends to the list? I'm guessing that using the @ operator
[1; 2; 3] @ [4]
would be less efficient than appending one element.
As others said, there is no such operator, because it wouldn't make much sense. I actually think that this is a good thing, because it makes it easier to realize that the operation will not be efficient. In practice, you shouldn't need the operator - there is usually a better way to write the same thing.
Typical scenario: I think that the typical scenario where you could think that you need to append elements to the end is so common that it may be useful to describe it.
Adding elements to the end seems necessary when you're writing a tail-recursive version of a function using the accumulator parameter. For example, an (inefficient) implementation of the filter function for lists would look like this:
let filter f l =
    let rec filterUtil acc l =
        match l with
        | [] -> acc
        | x::xs when f x -> filterUtil (acc @ [x]) xs
        | x::xs -> filterUtil acc xs
    filterUtil [] l
In each step, we need to append one element to the accumulator (which stores elements to be returned as the result). This code can be easily modified to use the :: operator instead of appending elements to the end of the acc list:
let filter f l =
    let rec filterUtil acc l =
        match l with
        | [] -> List.rev acc // (1)
        | x::xs when f x -> filterUtil (x::acc) xs // (2)
        | x::xs -> filterUtil acc xs
    filterUtil [] l
In (2), we're now adding elements to the front of the accumulator and when the function is about to return the result, we reverse the list (1), which is a lot more efficient than appending elements one by one.
Lists in F# are singly-linked and immutable. This means consing onto the front is O(1) (create an element and have it point to an existing list), whereas snocing onto the back is O(N) (as the entire list must be replicated; you can't change the existing final pointer, you must create a whole new list).
If you do need to "append one element to the back", then e.g.
l @ [42]
is the way to do it, but this is a code smell.
The cost of appending two standard lists is proportional to the length of the list on the left. In particular, the cost of
xs @ [x]
is proportional to the length of xs—it is not a constant cost.
If you want a list-like abstraction with a constant-time append, you can use John Hughes's function representation, which I'll call hlist. I'll try to use OCaml syntax, which I hope is close enough to F#:
type 'a hlist = 'a list -> 'a list (* a John Hughes list *)
let empty : 'a hlist = let id xs = xs in id
let append xs ys = fun tail -> xs (ys tail)
let singleton x = fun tail -> x :: tail
let cons x xs = append (singleton x) xs
let snoc xs x = append xs (singleton x)
let to_list : 'a hlist -> 'a list = fun xs -> xs []
The idea is that you represent a list functionally as a function from "the rest of the elements" to "the final list". This works great if you are going to build up the whole list before you look at any of the elements. Otherwise you'll have to deal with the linear cost of append or use another data structure entirely.
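As a rough F# rendering of the OCaml definitions above (a sketch, with the names HList and toList adapted to F# conventions and the sample values chosen arbitrarily), the same idea might be written like this:

```fsharp
// A Hughes list is a function from "the rest of the elements" to the final list.
type HList<'a> = 'a list -> 'a list

let empty : HList<'a> = fun tail -> tail
let append (xs : HList<'a>) (ys : HList<'a>) : HList<'a> = fun tail -> xs (ys tail)
let singleton (x : 'a) : HList<'a> = fun tail -> x :: tail
let cons x xs = append (singleton x) xs
let snoc xs x = append xs (singleton x)
let toList (xs : HList<'a>) : 'a list = xs []

// Add 1, 2, 3 at the back one by one, then put 0 at the front.
let built =
    [1; 2; 3]
    |> List.fold snoc empty
    |> cons 0
    |> toList
// built = [0; 1; 2; 3]
```

Each snoc only composes functions, so it takes constant time; the actual list is produced once, when toList applies the composed function to the empty list.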
I'm guessing that using @ operator [...] would be less efficient, than appending one element.
If it is, it will be a negligible difference. Both appending a single item and concatenating a list to the end are O(n) operations. As a matter of fact I can't think of a single thing that @ has to do, which a single-item append function wouldn't.
Maybe you want to use another data structure. We have double-ended queues (or short "Deques") in fsharpx. You can read more about them at http://jackfoxy.com/double-ended-queues-for-fsharp
The efficiency (or lack of) comes from iterating through the list to find the final element. So declaring a new list with [4] is going to be negligible for all but the most trivial scenarios.
Try using a double-ended queue instead of a list. I recently added 4 versions of deques (Okasaki's spelling) to FSharpx.Core (available through NuGet; source code at FSharpx.Core.Datastructures). See my article about using deques, Double-ended queues for F#.
I've suggested to the F# team that the cons operator, ::, and the active pattern discriminator be made available for other data structures with a head/tail signature.

Resources