F# Sequences, Lists and Arrays - f#

Suppose I have a list of elements of type 'a, i.e.;
let mylist: 'a list = ...
and a function f of type 'a -> 'b;
let f: 'a -> 'b = ...
Now I want to use f transform mylist into a 'b array.
Is the following:
mylist |> List.map f |> Array.ofList
improved upon performance-wise and memory-wise by the following:
mylist |> List.toSeq |> Seq.map f |> Array.ofSeq
?

The answer depends on your use-case, ìe how big is myList? However, in general Seq in F# is slow and wasteful with memory. Luckily, for the most time that doesn't matter as we usually only have to avoid "truly awful performance" while "mediocre performance" is usually good enough.
I haven't measured it (shame) but from my experience in most cases this would perform the better than the examples by OP
mylist |> Array.ofList |> Array.map f
Convert from list to Array ASAP and then map the Array (Arrays are for sequential access one of the more efficient structures in .NET).
In some rare scenarios where storing an intermediate array value would cause memory to be swapped to disk I suppose using Seq could help, however in those scenarios why would we be using List<_> as that is wasteful with memory compared to Array<_> (unless we can make use of tail reuse)?
There are libraries in .NET that has Seq like properties but still provides decent performance like: https://github.com/nessos/Streams
Church-encoded lists or transducers are also quite performant while avoiding intermediate array values.

Related

Different argument order for getting N-th element of Array, List or Seq

Is there a good reason for a different argument order in functions getting N-th element of Array, List or Seq:
Array.get source index
List .nth source index
Seq .nth index source
I would like to use pipe operator and it seems possible only with Seq:
s |> Seq.nth n
Is there a way to have the same notation with Array or List?
I don't think of any good reason to define Array.get and List.nth this way. Given that pipeplining is very common in F#, they should have been defined so that the source argument came last.
In case of List.nth, it doesn't change much because you can use Seq.nth and time complexity is still O(n) where n is length of the list:
[1..100] |> Seq.nth 10
It's not a good idea to use Seq.nth on arrays because you lose random access. To keep O(1) running time of Array.get, you can define:
[<RequireQualifiedAccess>]
module Array =
/// Get n-th element of an array in O(1) running time
let inline nth index source = Array.get source index
In general, different argument order can be alleviated by using flip function:
let inline flip f x y = f y x
You can use it directly on the functions above:
[1..100] |> flip List.nth 10
[|1..100|] |> flip Array.get 10
    
Just use backward pipe operator:
[1..1000] |> List.nth <| 42
Since both operators are left associative, x |> f <| y is parsed as (x |> f) <| y, and this does the trick.
Backward pipe operator is also useful if you want to remove parentheses: f (very long expression) can be replaced with f <| very long expression.
Since Pad and bytebuster answered your last question I will focus on the why part.
This is based my current knowledge and not historical facts.
Since F# derived from OCaml and OCaml has Array and List but not Seq and F# uses |> for natural pipelining and type checking and OCaml lacks the pipleline operator, the authors of F# made the switch for Seq. But obviously to be backward compatablie with OCaml they did not switch everything.

Seq.iter vs for - what difference?

I can do
for event in linq.Deltas do
or I can do
linq.Deltas |> Seq.iter(fun event ->
So I'm not sure if that is the same. If that is not the same I want to know the difference. I don't know what to use: iter or for.
added - so if that is the matter of choice I prefer to use iter on a top level and for is for closures
added some later - looking like iter is map + ignore - it's the way to run from using imperative ignore word. So it's looking like functional way ...
As others mentioned, there are some differences (iter supports non-generic IEnumerator and you can mutate mutable values in for). These are sometimes important differences, but most of the times you can freely choose which one to use.
I generally prefer for (if there is a language construct, why not use it?). The cases where iter looks nicer are when you have a function that you need to call (e.g. using partial application):
// I would write this:
strings |> Seq.iter (printfn "%c")
// instead of:
for s in strings do printfn "%c" s
Similarly, using iter is nicer if you use it at the end of some processing pipeline:
// I would write this:
inputs |> Seq.filter (fun x -> x > 0)
|> Seq.iter (fun x -> foo x)
// instead of:
let filtered = inputs |> Seq.filter (fun x -> x > 0)
for x in filtered do foo x
You can modify mutable variables from the body of a for loop. You can't do that from a closure, which implies you can't do that using iter. (Note: I'm talking about mutable variables declared outside of the for / iter. Local mutable variables are accessible.)
Considering that the point of iter is to perform some side effect, the difference can be important.
I personally seldom use iter, as I find for to be clearer.
For most of the situations, they are the same. I would prefer the first use. It looks clear to me.
The difference is that for in loop support IEnumerable objects, while Seq.iter requires that your collection (linq.deltas) is IEnumerable<T>.
E.g. MatchCollection class in .net regular expression inherits IEnumerable not IEnumerable<T>, you cannot use Seq.map or Seq.iter directly on it. But you can use for in loop.
It is the style of programming. Imperative vs using functional programming. Keep in mind that F# is not a pure functional programming language.
Generally, use Seq.Iter if it is a part of some large pipeline processing, as that makes it much more clearer, but for ordinary case I think the imperative way is clearer. Sometime it is a personal preference, sometimes it is other issues like performance.
for in F# is a form of list comprehension - bread and butter of functional programming while Seq.iter is a 'for side-effects only' imperative construct - not a sign of a functional code. Here what you can do with for:
let pairsTo n = seq {
for i in [1..n] do
for j in [i..n] do
if j%i <> 0 then yield (i,j) }
printf "%A" (pairsTo 10 |> Seq.toList)

How do you work with IList<> in F#?

I have a list of type IList<Effort>. The model Effort contains a float called Amount. I would like to return the sum of Amount for the whole list, in F#. How would this be achieved?
efforts |> Seq.sumBy (fun e -> e.Amount)
Upvoted the answers of Seq.fold, pipelined Seq.fold, and pipelined Seq.sumBy (I like the third one best).
That said, no one has mentioned that seq<'T> is F#'s name for IEnumerable<T>, and so the Seq module functions work on any IEnumerable, including ILists.
Seq.fold (fun acc (effort: Effort) -> acc + effort.Amount) 0.0 efforts
One detail that may be interesting is that you can also avoid using type annotations. In the code by sepp2k, you need to specify that the effort value has a type Effort, because the compiler is processing code from the left to the right (and it would fail on the call effort.Amount if it didn't know the type). You can write this using pipelining operator:
efforts |> Seq.fold (fun acc effort -> acc + effort.Amount) 0.0
Now, the compiler knows the type of effort because it knows that it is processing a collection efforts of type IList<Effort>. It's a minor improvement, but I think it's quite nice.

cons operator (::) in F#

The :: operator in F# always prepends elements to the list. Is there an operator that appends to the list? I'm guessing that using # operator
[1; 2; 3] # [4]
would be less efficient, than appending one element.
As others said, there is no such operator, because it wouldn't make much sense. I actually think that this is a good thing, because it makes it easier to realize that the operation will not be efficient. In practice, you shouldn't need the operator - there is usually a better way to write the same thing.
Typical scenario: I think that the typical scenario where you could think that you need to append elements to the end is so common that it may be useful to describe it.
Adding elements to the end seems necessary when you're writing a tail-recursive version of a function using the accumulator parameter. For example a (inefficient) implementation of filter function for lists would look like this:
let filter f l =
let rec filterUtil acc l =
match l with
| [] -> acc
| x::xs when f x -> filterUtil (acc # [x]) xs
| x::xs -> filterUtil acc xs
filterUtil [] l
In each step, we need to append one element to the accumulator (which stores elements to be returned as the result). This code can be easily modified to use the :: operator instead of appending elements to the end of the acc list:
let filter f l =
let rec filterUtil acc l =
match l with
| [] -> List.rev acc // (1)
| x::xs when f x -> filterUtil (x::acc) xs // (2)
| x::xs -> filterUtil acc xs
filterUtil [] l
In (2), we're now adding elements to the front of the accumulator and when the function is about to return the result, we reverse the list (1), which is a lot more efficient than appending elements one by one.
Lists in F# are singly-linked and immutable. This means consing onto the front is O(1) (create an element and have it point to an existing list), whereas snocing onto the back is O(N) (as the entire list must be replicated; you can't change the existing final pointer, you must create a whole new list).
If you do need to "append one element to the back", then e.g.
l # [42]
is the way to do it, but this is a code smell.
The cost of appending two standard lists is proportional to the length of the list on the left. In particular, the cost of
xs # [x]
is proportional to the length of xs—it is not a constant cost.
If you want a list-like abstraction with a constant-time append, you can use John Hughes's function representation, which I'll call hlist. I'll try to use OCaml syntax, which I hope is close enough to F#:
type 'a hlist = 'a list -> 'a list (* a John Hughes list *)
let empty : 'a hlist = let id xs = xs in id
let append xs ys = fun tail -> xs (ys tail)
let singleton x = fun tail -> x :: tail
let cons x xs = append (singleton x) xs
let snoc xs x = append xs (singleton x)
let to_list : 'a hlist -> 'a list = fun xs -> xs []
The idea is that you represent a list functionally as a function from "the rest of the elements" to "the final list". This works great if you are going to build up the whole list before you look at any of the elements. Otherwise you'll have to deal with the linear cost of append or use another data structure entirely.
I'm guessing that using # operator [...] would be less efficient, than appending one element.
If it is, it will be a negligible difference. Both appending a single item and concatenating a list to the end are O(n) operations. As a matter of fact I can't think of a single thing that # has to do, which a single-item append function wouldn't.
Maybe you want to use another data structure. We have double-ended queues (or short "Deques") in fsharpx. You can read more about them at http://jackfoxy.com/double-ended-queues-for-fsharp
The efficiency (or lack of) comes from iterating through the list to find the final element. So declaring a new list with [4] is going to be negligible for all but the most trivial scenarios.
Try using a double-ended queue instead of list. I recently added 4 versions of deques (Okasaki's spelling) to FSharpx.Core (Available through NuGet. Source code at FSharpx.Core.Datastructures). See my article about using dequeus Double-ended queues for F#
I've suggested to the F# team the cons operator, ::, and the active pattern discriminator be made available for other data structures with a head/tail signature.3

While or Tail Recursion in F#, what to use when?

Ok, only just in F# and this is how I understand it now :
Some problems are recursive in nature (building or reading out a treestructure to name just one) and then you use recursion. In these cases you preferably use tail-recursion to give the stack a break
Some languagues are pure functional, so you have to use recursion in stead of while-loops, even if the problem is not recursive in nature
So my question : since F# also support the imperative paradigm, would you use tail recursion in F# for problems that aren't naturally recursive ones? Especially since I have read the compiler recongnizes tail recursion and just transforms it in a while loop anyway?
If so : why ?
The best answer is 'neither'. :)
There's some ugliness associated with both while loops and tail recursion.
While loops require mutability and effects, and though I have nothing against using these in moderation, especially when encapsulated in the context of a local function, you do sometimes feel like you're cluttering/uglifying your program when you start introducing effects merely to loop.
Tail recursion often has the disadvantage of requiring an extra accumulator parameter or continuation-passing style. This clutters the program with extra boilerplate to massage the startup conditions of the function.
The best answer is to use neither while loops nor recursion. Higher-order functions and lambdas are your saviors here, especially maps and folds. Why fool around with messy control structures for looping when you can encapsulate those in reusable libraries and then just state the essence of your computation simply and declaratively?
If you get in the habit of often calling map/fold rather than using loops/recursion, as well as providing a fold function along with any new tree-structured data type you introduce, you'll go far. :)
For those interested in learning more about folds in F#, why not check out my first three blog posts in a series on the topic?
In order of preference and general programming style, I will write code as follows:
Map/fold if its available
let x = [1 .. 10] |> List.map ((*) 2)
Its just convenient and easy to use.
Non-tail recursive function
> let rec map f = function
| x::xs -> f x::map f xs
| [] -> [];;
val map : ('a -> 'b) -> 'a list -> 'b list
> [1 .. 10] |> map ((*) 2);;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
Most algorithms are easiest to read and express without tail-recursion. This works particularly well when you don't need to recurse too deeply, making it suitable for many sorting algorithms and most operations on balanced data structures.
Remember, log2(1,000,000,000,000,000) ~= 50, so log(n) operation without tail-recursion isn't scary at all.
Tail-recursive with accumulator
> let rev l =
let rec loop acc = function
| [] -> acc
| x::xs -> loop (x::acc) xs
loop [] l
let map f l =
let rec loop acc = function
| [] -> rev acc
| x::xs -> loop (f x::acc) xs
loop [] l;;
val rev : 'a list -> 'a list
val map : ('a -> 'b) -> 'a list -> 'b list
> [1 .. 10] |> map ((*) 2);;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
It works, but the code is clumsy and elegance of the algorithm is slightly obscured. The example above isn't too bad to read, but once you get into tree-like data structures, it really starts to become a nightmare.
Tail-recursive with continuation passing
> let rec map cont f = function
| [] -> cont []
| x::xs -> map (fun l -> cont <| f x::l) f xs;;
val map : ('a list -> 'b) -> ('c -> 'a) -> 'c list -> 'b
> [1 .. 10] |> map id ((*) 2);;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
Whenever I see code like this, I say to myself "now that's a neat trick!". At the cost of readability, it maintains the shape of the non-recursive function, and found it really interesting for tail-recursive inserts into binary trees.
Its probably my monad-phobia speaking here, or maybe my inherent lack of familiarity with Lisp's call/cc, but I think those occasions when CSP actually simplifies algorithms are few and far between. Counter-examples are welcome in the comments.
While loops / for loops
It occurs to me that, aside from sequence comprehensions, I've never used while or for loops in my F# code. In any case...
> let map f l =
let l' = ref l
let acc = ref []
while not <| List.isEmpty !l' do
acc := (!l' |> List.hd |> f)::!acc
l' := !l' |> List.tl
!acc |> List.rev;;
val map : ('a -> 'b) -> 'a list -> 'b list
> [1 .. 10] |> map ((*) 2);;
val it : int list = [2; 4; 6; 8; 10; 12; 14; 16; 18; 20]
Its practically a parody of imperative programming. You might be able to maintain a little sanity by declaring let mutable l' = l instead, but any non-trivial function will require the use of ref.
Honestly, any problem that you can solve with a loop is already a naturally recursive one, as you can translate both into (usually conditional) jumps in the end.
I believe you should stick with tail calls in almost all cases where you must write an explicit loop. It is just more versatile:
A while loop restricts you to one loop body, while a tail call can allow you to switch between many different states while the "loop" is running.
A while loop restricts you to one condition to check for termination, with the tail recursion you can have an arbitrarily complicated match expression waiting to shunt the control flow off somewhere else.
Your tail calls all return useful values and can produce useful type errors. A while loop does not return anything and relies on side effects to do its work.
While loops are not first class while functions with tail calls (or while loops in them) are. Recursive bindings in local scope can be inspected for their type.
A tail recursive function can easily be broken apart into parts that use tail calls to call each other in the needed sequence. This may make things easier to read, and will help if you find you want to start in the middle of a loop. This is not true of a while loop.
All in all while loops in F# are only worthwhile if you really are going to be working with mutable state, inside a function body, doing the same thing repeatedly until a certain condition is met. If the loop is generally useful or very complicated, you may want to factor it out into some other top level binding. If the data types are themselves immutable (a lot of .NET value types are), you may gain very little from using mutable references to them anyway.
I'd say that you should only resort to while loops for niche cases where a while loop is perfect for the job, and is relatively short. In many imperative programming languages, while loops are often twisted into unnatural roles like driving stuff repeatedly over a case statement. Avoid those kinds of things, and see if you can use tail calls or, even better, a higher order function, to achieve the same ends.
Many problems have a recursive nature, but having thought imperatively for a long time often prevents us from seeing this.
In general I would use a functional technique wherever possible in a functional language - Loops are never functional since they exclusively rely on side-effects. So when dealing with imperative code or algorithms, using loops is adequate, but in functional context, they're aren't considered very nice.
Functional technique doesn't only mean recursion but also using appropriate higher-order functions.
So when summing a list, neither a for-loop nor a recursive function but a fold is the solution for having comprehensible code without reinventing the wheel.
for problems that aren't naturally recursive ones
..
just transforms it in a while loop anyway
You answered this yourself.
Use recursion for recursive problems and loop for things that aren't functional in nature.
Just always think: Which feels more natural, which is more readable.

Resources