Re-implementing List.map in OCaml/F# with correct side effect order?

According to this previous answer
You could implement List.map like this:
let rec map project = function
  | [] -> []
  | head :: tail ->
    project head :: map project tail ;;
but instead, it is implemented like this:
let rec map project = function
  | [] -> []
  | head :: tail ->
    let result = project head in
    result :: map project tail ;;
They say that it is done this way to make sure the projection function is called in the expected order in case it has side effects, e.g.
map print_int [1;2;3] ;;
should print 123, but the first implementation would print 321. However, when I test both of them myself in OCaml and F#, they produce exactly the same 123 result.
(Note that I am testing this in the OCaml and F# REPLs--Nick in the comments suggests this might be the cause of my inability to reproduce, but why?)
What am I misunderstanding? Can someone elaborate why they should produce different orders and how I can reproduce? This runs contrary to my previous understanding of OCaml code I've written in the past so this was surprising to me and I want to make sure not to repeat the mistake. When I read the two, I read it as exactly the same thing with an extraneous intermediary binding.
My only guess is that the order of expression evaluation using cons is right to left, but that seems very odd?
This is being done purely as research to better understand how OCaml executes code, I don't really need to create my own List.map for production code.

The point is that the order of function application in OCaml is unspecified, not that it will be in some specific undesired order.
When evaluating this expression:
project head :: map project tail
OCaml is allowed to evaluate project head first or it can evaluate map project tail first. Which one it chooses to do is unspecified. (In theory it would probably be admissible for the order to be different for different calls.) Since you want a specified order, you need to use the form with let.
The fact that the order is unspecified is documented in Section 6.7 of the OCaml manual. See the section Function application:
The order in which the expressions expr, argument1, …, argumentn are evaluated is not specified.
(The claim that the evaluation order is unspecified isn't something you can test. No number of cases of a particular order prove that that order is always going to be chosen.)

So when you have an implementation of map like this:
let rec map f = function
| [] -> []
| a::l -> f a :: map f l
none of the function applications (f a) within the map calls are guaranteed to be evaluated sequentially in the order you'd expect. So when you try this:
map print_int [1;2;3]
you may get the output
321- : unit list = [(); (); ()]
since those function applications aren't guaranteed to execute in any specific order.
Now when you implement the map like this:
let rec map f = function
| [] -> []
| a::l -> let r = f a in r :: map f l
you're forcing the function applications to be executed in the order you're expecting, because the let binding evaluates f a before the expression that follows in (which contains the recursive call) is evaluated.
So now when you try:
map print_int [1;2;3]
you will get
123- : unit list = [(); (); ()]
because you've explicitly made an effort to evaluate the function applications in order.
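To make the sequencing visible without relying on printed output, here is a minimal F# sketch of the let-bound version that records the order in which the projection function is called. The names calls, record and mapSequenced are hypothetical and only serve the illustration. It shows that the let form pins the order down; it does not demonstrate the unspecified behaviour of the other form.

```fsharp
// 'calls' records the order in which the projection function is invoked.
let calls = System.Collections.Generic.List<int>()

// Hypothetical projection with a side effect: log the argument, then double it.
let record (x : int) : int =
    calls.Add x
    x * 2

let rec mapSequenced project list =
    match list with
    | [] -> []
    | head :: tail ->
        // The let binding is fully evaluated before the expression below it,
        // so 'project head' always runs before the recursive call.
        let result = project head
        result :: mapSequenced project tail

let doubled = mapSequenced record [1; 2; 3]
let order = List.ofSeq calls
```

Because the binding comes first, order matches the input order regardless of how the compiler treats the operands of ::.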

Related

How to properly create and use polynomial type and term type in f#

I'm trying to do this exercise:
I'm not sure how to use types in F#. In F# Interactive, I wrote type term = Term of float *int. Then I tried to create a value of type term with let x: term = (3.5,8);; but it gives an error.
Then I tried let x: term = Term (3.5,8);; and it worked. Why is that?
For the first function, I tried:
let multiplyPolyByTerm (x:term, p:poly)=
match p with
|[]->[]
But that gives an error on the line |[]->[] saying that the expression is expecting a type poly, but poly is in fact a list, right? So why is it wrong here? I fixed it with |Poly[]->Poly[]. Then I tried to finish the function by giving the recursive definition of multiplying each term of the polynomial by the given term: |Poly a::af-> This gives an error, so I'm stuck on how to break down the Poly list.
If anyone has suggestion on good readings about Type in F#, please share it.
I have all the methods now. However, I find myself unable to throw an exception when the polynomial is an empty list, because the base case of my recursive function is an empty list. Also, I don't know how to group common terms together. Please help; here is my code:
type poly=Poly of (float*int) list
type term = Term of float *int
exception EmptyList
(*
let rec mergeCommonTerm(p:poly)=
    let rec iterator ((a: float,b: int ), k: (float*int) list)=
        match k with
        |[]->(a,b)
        |ki::kf-> if b= snd ki then (a+ fst ki,b)
    match p with
    |Poly [] -> Poly []
    |Poly (a::af)-> match af with
                    |[]-> Poly [a]
                    |b::bf -> if snd a =snd b then Poly (fst a +fst b,snd a)::bf
                              else
*)
let rec multiplyPolyByTerm (x:term, p:poly)=
    match x with
    | Term (coe,deg) -> match p with
                        |Poly[] -> Poly []
                        |Poly (a::af) -> match multiplyPolyByTerm (x,Poly af) with
                                         |Poly recusivep-> Poly ((fst a *coe,snd a + deg)::recusivep)
let rec addTermToPoly (x:term, p:poly)=
    match x with
    |Term (coe, deg)-> match p with
                       |Poly[] -> Poly [(coe,deg)]
                       |Poly (a::af)-> if snd a=deg then Poly ((fst a+coe,deg)::af)
                                       else match addTermToPoly (x,Poly af) with
                                            |Poly recusivep-> Poly (a::recusivep)
let rec addPolys (x:poly, y: poly)=
    match x with
    |Poly []->y
    |Poly (xh::xt)-> addPolys(Poly xt,addTermToPoly(Term xh, y))
let rec multPolys (x:poly,y:poly)=
    match x with
    |Poly []-> Poly[]
    |Poly (xh::xt)->addPolys (multiplyPolyByTerm(Term xh,y),multPolys(Poly xt,y))
let evalTerm (values:float) (termmm : term) :float=
    match termmm with
    |Term (coe,deg)->coe*(values**float(deg))
let rec evalPoly (polyn : poly, v: float) :float=
    match polyn with
    |Poly []->0.0
    |Poly (ph::pt)-> (evalTerm v (Term ph)) + evalPoly (Poly pt,v)
let rec diffPoly (p:poly) :poly=
    match p with
    |Poly []->Poly []
    |Poly (ah::at)-> match diffPoly (Poly at) with
                     |Poly [] -> if snd ah = 0 then Poly []
                                 else Poly [(float(snd ah)*fst ah,snd ah - 1)]
                     |Poly (bh::bt)->Poly ((float(snd ah)*fst ah,snd ah - 1)::bh::bt)
As I mentioned in a comment, reading https://fsharpforfunandprofit.com/posts/discriminated-unions/ will be very helpful for you. But let me give you some quick help to get you unstuck and starting to solve your immediate problems. You're on the right track, you're just struggling a little with the syntax (and operator precedence, which is part of the syntax).
First, load the MSDN operator precedence documentation in another tab while you read the rest of this answer. You'll want to look at it later on, but first I'll explain a subtlety of how F# treats discriminated unions that you probably haven't understood yet.
When you define a discriminated union type like poly, the name Poly acts like a constructor for the type. In F#, constructors are functions. So when you write Poly (something), the F# parser interprets this as "take the value (something) and pass it to the function named Poly". Here, the function Poly isn't one you had to define explicitly; it was implicitly defined as part of your type definition. To really make this clear, consider this example:
type Example =
    | Number of int
    | Text of string
5            // This has type int
Number 5     // This has type Example
Number       // This has type (int -> Example), i.e. a function
"foo"        // This has type string
Text "foo"   // This has type Example
Text         // This has type (string -> Example), i.e. a function
Now look at the operator precedence list that you loaded in another tab. Lowest precedence is at the top of the table, and highest precedence is at the bottom; in other words, the lower something is on the table, the more "tightly" it binds. As you can see, function application (f x, calling f with parameter x) binds very tightly, more tightly than the :: operator. So when you write f a::b, that is not read as f (a::b), but rather as (f a)::b. In other words, f a::b reads as "Item b is a list of some type which we'll call T, and the function call f a produces an item of type T that should go in front of list b". If you instead meant "take the list formed by putting item a at the head of list b, and then call f with the resulting list", then that needs parentheses: you have to write f (a::b) to get that meaning.
So when you write Poly a::af, that's interpreted as (Poly a)::af, which means "Here is a list. The first item is a Poly a, which means that a is a (float * int) list. The rest of the list will be called af". And since the value you're passing into it is not a list, but rather a poly type, that is a type mismatch. (Note that items of type poly contain lists, but they are not themselves lists.) What you needed to write was Poly (a::af), which would have meant "Here is an item of type poly that contains a list. That list should be split into the head, a, and the rest, af."
I hope that helps rather than muddying the waters further. If you didn't understand any part of this, let me know and I'll try to make it clearer.
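To see the precedence point in isolation, here is a small sketch that uses a made-up wrapper type rather than the homework types. Wrapper, Wrap and headOrZero are hypothetical names chosen for the illustration. The parentheses in the second pattern are what make the constructor apply to the whole list pattern.

```fsharp
// Hypothetical single-case union that wraps a list, similar in shape to poly.
type Wrapper = Wrap of int list

let headOrZero (w : Wrapper) : int =
    match w with
    | Wrap [] -> 0
    // Parentheses are required: 'Wrap first :: _' would parse as '(Wrap first) :: _',
    // i.e. a pattern over a list of Wrapper values, which does not match type Wrapper.
    | Wrap (first :: _) -> first

let fromNonEmpty = headOrZero (Wrap [7; 8])
let fromEmpty = headOrZero (Wrap [])
```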
P.S. Another point of syntax you might want to know: F# gives you many ways to signal an error condition (like an empty list in this assignment), but your professor has asked you to use exception EmptyList when invalid input is given. That means he expects your code to "throw" or "raise" an exception when you encounter an error. In C# the term is "throw", but in F# the term is "raise", and the syntax looks like this:
if someErrorCondition then
    raise EmptyList
// Or ...
match listThatShouldNotBeEmpty with
| [] -> raise EmptyList
| head::rest -> // Do something with head, etc.
That should take care of the next question you would have needed to ask. :-)
Update 2: You've edited your question to clarify another issue you're having, where your recursive function boils down to an empty list as the base case — yet your professor asked you to consider an empty list as an invalid input. There are two ways to solve this. I'll discuss the more complicated one first, then I'll discuss the easier one.
The more complicated way to solve this is to have two separate functions, an "outer" one and an "inner" one, for each of the functions you have been asked to define. In each case, the "outer" one checks whether the input is an empty list and throws an exception if that's the case. If the input is not an empty list, then it passes the input to the "inner" function, which does the recursive algorithm (and does NOT consider an empty list to be an error). So the "outer" function is basically only doing error-checking, and the "inner" function is doing all the work. This is a VERY common approach in professional programming, where all your error-checking is done at the "edges" of your code, while the "inner" code never has to deal with errors. It's therefore a good approach to know about — but in your particular case, I think it's more complicated than you need.
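A minimal sketch of that outer/inner split, using a plain "sum a list of integers" exercise rather than the homework functions. The names sumChecked and loop are hypothetical; only the exception EmptyList comes from the assignment.

```fsharp
exception EmptyList

// Outer function: only validates the input, then delegates.
let sumChecked (input : int list) : int =
    // Inner function: does the recursion. Here an empty list is an
    // ordinary base case, not an error.
    let rec loop acc rest =
        match rest with
        | [] -> acc
        | x :: tail -> loop (acc + x) tail
    match input with
    | [] -> raise EmptyList
    | _ -> loop 0 input

let total = sumChecked [1; 2; 3]
```

Callers only ever see sumChecked, so the empty-list check happens once at the edge and the recursion itself stays simple.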
The easier solution is to rewrite your functions to consider a single-item list as the base case, so that your recursive functions never go all the way to an empty list. Then you can always consider an empty list to be an error. Since this is homework I won't give you an example based on your actual code, but rather an example based on a simple "take the sum of a list of integers" exercise where an empty list would be considered an error:
let rec sumNonEmptyList (input : int list) : int =
    match input with
    | [] -> raise EmptyList
    | [x] -> x
    | x::rest -> x + sumNonEmptyList rest
The syntax [x] in a match expression means "This matches a list with exactly one item in it, and assigns the name x to the value of that item". In your case, you'd probably be matching against Poly [] to raise an exception, Poly [a] as the base case, and Poly (a::af) as the "more than one item" case. (That's as much of a clue as I think I should give you; you'll learn better if you work out the rest yourself).

F# on List of Elements

I am trying to write a F# function that finds the biggest value. I am new to F# and am confused as to how to implement this with the correct type and recursion.
Any help would be greatly appreciated along with an explanation of how it works, I really need to understand how it works so I can attempt to create other F# functions. Thanks!
When creating recursive functions, start thinking about the corner cases. Your helper function takes a list and a "maximum so far". Corner cases: What if your list is empty? What if you only have a 1 element list, or focus on the first element? That directly translates into a match statement:
let rec helper (l, m) =
    match l, m with
    | [], m -> m
    | (l1 :: rest), m ->
        let max1 = if l1 > m then l1 else m
        helper(rest, max1)
I'll leave the wrapper findMax open, but clearly you can solve that using the same thinking: What if you get an empty list? (scream!) What if you get a list with elements (the first element is your maximum so far, feed the rest of the list into your helper)
And of course you could put it all into one function. I've chosen this rather roundabout helper because your template code was shaped in that way.
The first thing to do is to start thinking recursively and/or mathematically. In most general vague terms, it should look like "The result of my function is..." - then try to actually put into words what the result should be.
Applying to your particular problem, I would phrase it like this:
when given a list of one element, the result of findMax is that element.
when given a list of more than one element, the result of findMax is the maximum of the list's head and the maximum element of its tail.
This thinking can be translated into F# almost word for word:
let rec findMax list =
    match list with
    | [x] -> x
    | head::tail -> max head (findMax tail)
where:
let max a b = if a > b then a else b
Note, however, that this function is incomplete: it doesn't specify what the result should be when given an empty list. I will leave this as an exercise for the reader.

How to write efficient list/seq functions in F#? (mapFoldWhile)

I was trying to write a generic mapFoldWhile function, which is just mapFold but requires the state to be an option and stops as soon as it encounters a None state.
I don't want to use mapFold because it will transform the entire list, but I want it to stop as soon as an invalid state (i.e. None) is found.
This was my first attempt:
let mapFoldWhile (f : 'State option -> 'T -> 'Result * 'State option) (state : 'State option) (list : 'T list) =
    let rec mapRec f state list results =
        match list with
        | [] -> (List.rev results, state)
        | item :: tail ->
            let (result, newState) = f state item
            match newState with
            | Some x -> mapRec f newState tail (result :: results)
            | None -> ([], None)
    mapRec f state list []
The List.rev irked me, since the point of the exercise was to exit early and constructing a new list ought to be even slower.
So I looked up what F#'s very own map does, which was:
let map f list = Microsoft.FSharp.Primitives.Basics.List.map f list
The ominous Microsoft.FSharp.Primitives.Basics.List.map can be found here and looks like this:
let map f x =
    match x with
    | [] -> []
    | [h] -> [f h]
    | (h::t) ->
        let cons = freshConsNoTail (f h)
        mapToFreshConsTail cons f t
        cons
The consNoTail stuff is also in this file:
// optimized mutation-based implementation. This code is only valid in fslib, where mutation of private
// tail cons cells is permitted in carefully written library code.
let inline setFreshConsTail cons t = cons.(::).1 <- t
let inline freshConsNoTail h = h :: (# "ldnull" : 'T list #)
So I guess it turns out that F#'s immutable lists are actually mutable because performance? I'm a bit worried about this, having used the prepend-then-reverse list approach as I thought it was the "way to go" in F#.
I'm not very experienced with F# or functional programming in general, so maybe (probably) the whole idea of creating a new mapFoldWhile function is the wrong thing to do, but then what am I to do instead?
I often find myself in situations where I need to "exit early" because a collection item is "invalid" and I know that I don't have to look at the rest. I'm using List.pick or Seq.takeWhile in some cases, but in other instances I need to do more (mapFold).
Is there an efficient solution to this kind of problem (mapFoldWhile in particular and "exit early" in general) with functional programming concepts, or do I have to switch to an imperative solution / use a Collections.Generics.List?
In most cases, using List.rev is a perfectly sufficient solution.
You are right that the F# core library uses mutation and other dirty hacks to squeeze some more performance out of the F# list operations, but I think the micro-optimizations done there are not a particularly good example to follow. F# list functions are used almost everywhere, so it might be a good trade-off there, but I would not follow it in most situations.
Running your function with the following:
let l = [ 1 .. 1000000 ]
#time
mapFoldWhile (fun s v -> 0, s) (Some 1) l
I get ~240ms on the second line when I run the function without changes. When I just drop List.rev (so that it returns the data in the other order), I get around ~190ms. If you are really calling the function frequently enough that this matters, then you'd have to use mutation (actually, your own mutable list type), but I think that is rarely worth it.
For general "exit early" problems, you can often write the code as a composition of Seq.scan and Seq.takeWhile. For example, say you want to sum numbers from a sequence until you reach 1000. You can write:
input
|> Seq.scan (fun sum v -> v + sum) 0
|> Seq.takeWhile (fun sum -> sum < 1000)
Using Seq.scan generates a sequence of sums that is over the whole input, but since this is lazily generated, using Seq.takeWhile stops the computation as soon as the exit condition happens.
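A runnable sketch of the same pipeline, assuming a hypothetical infinite input sequence 1, 2, 3, ... built with Seq.initInfinite. If the pipeline were not lazy, it would never finish on such an input; here it stops once the running sum reaches 1000.

```fsharp
// Hypothetical input: the infinite sequence 1, 2, 3, ...
let input = Seq.initInfinite (fun i -> i + 1)

let partialSums =
    input
    |> Seq.scan (fun sum v -> v + sum) 0       // running sums, starting with the initial state 0
    |> Seq.takeWhile (fun sum -> sum < 1000)   // stop pulling items once the condition fails
    |> Seq.toList

// The last running sum that is still below 1000.
let lastSum = List.last partialSums
```

Note that Seq.scan also yields the initial state, so the resulting list starts with 0.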

Comparing values in loop inside function

I want to make a function that takes an integer list as argument and compares every value and returns the largest value. In C# I would simply iterate through every value in the list, save the largest to a variable and return it, I'm hoping F# works similarly but the syntax is kinda iffy for me, here's what my code looks like. Also max2 is a function that compares 2 values and returns the largest.
let max_list list =
    let a = 0 : int
    match list with
    | head :: tail -> (for i in list do a = max2 i a) a
    | [] -> failwith "sry";;
You could use a mutable variable and write the code using a for loop, just like in C#. However, if you're doing this to learn F# and functional concepts, then it's a good idea to use recursion.
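For reference, a minimal sketch of the imperative version. The max2 defined here is a hypothetical stand-in for the asker's helper, and maxListImperative is a made-up name. The important differences from the code in the question are the mutable keyword and the <- assignment operator; in F#, a = max2 i a is a comparison, not an assignment.

```fsharp
// Hypothetical stand-in for the asker's max2 helper.
let max2 (a : int) (b : int) : int = if a > b then a else b

let maxListImperative (list : int list) : int =
    match list with
    | [] -> failwith "empty list"
    | head :: tail ->
        // Start from the first element rather than 0, so that lists
        // containing only negative numbers are handled correctly.
        let mutable best = head
        for i in tail do
            best <- max2 i best
        best

let largest = maxListImperative [3; 9; 2]
```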
In this case, the recursive function is a bit longer, but it demonstrates the key concepts, including pattern matching, so learning the tricks will be useful when writing more complicated F# code.
The key idea is to write a function that takes the largest value found so far and calls itself recursively until it reaches the end of the list.
let max_list list =
    // Inner recursive function that takes the largest value found so far
    // and a list to be processed (if it is empty, it returns 'maxSoFar')
    let rec loop maxSoFar list =
        match list with
        // If the head value is greater than what we found so far, use it as new greater
        | head::tail when head > maxSoFar -> loop head tail
        // If the head is smaller, use the previous maxSoFar value
        | _::tail -> loop maxSoFar tail
        // At the end, just return the largest value found so far
        | [] -> maxSoFar
    // Start with head as the greatest and tail as the rest to be processed
    // (fails for empty list - but you could match here to give better error)
    loop (List.head list) (List.tail list)
As a final note, this will be slow because it uses generic comparison (via an interface). You can make the function faster using let inline max_list list = (...). That way, the code will use native comparison instruction when used with primitive types like int (this is really a special case - the problem only really happens with generic comparison)
Also know that you can write a nice one-liner using reduce:
let max_list list = List.reduce (fun max x -> if x > max then x else max) list
If your intention is to be able to find the maximum value of items in a list where the value of the items is found by the function max2 then this approach works:
let findMax list =
    list
    |> List.map (fun i -> i, max2 i)
    |> List.maxBy snd
    |> fst

How do I know if a function is tail recursive in F#

I wrote the following function:
let str2lst str =
    let rec f s acc =
        match s with
        | "" -> acc
        | _ -> f (s.Substring 1) (s.[0]::acc)
    f str []
How can I know if the F# compiler turned it into a loop? Is there a way to find out without using Reflector (I have no experience with Reflector and I don't know C#)?
Edit: Also, is it possible to write a tail recursive function without using an inner function, or does the loop have to live in an inner function?
Also, is there a function in the F# standard library that runs a given function a number of times, each time feeding it the previous output as input? Let's say I have a string; I want to run a function over the string, then run it again over the resulting string, and so on...
Unfortunately there is no trivial way.
It is not too hard to read the source code and use the types and determine whether something is a tail call by inspection (is it 'the last thing', and not in a 'try' block), but people second-guess themselves and make mistakes. There's no simple automated way (other than e.g. inspecting the generated code).
Of course, you can just try your function on a large piece of test data and see if it blows up or not.
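A rough sketch of that "try it on big input" check, using a hypothetical sumTo function rather than the code from the question. The inner loop calls itself in tail position, so with default compiler settings it should run in constant stack space even for ten million iterations. A version that is not tail recursive would be expected to overflow the stack at this depth, which on .NET terminates the process rather than raising a catchable exception.

```fsharp
// Hypothetical test function: sums 1..n with an accumulator.
let sumTo (n : int64) : int64 =
    // The recursive call is the last thing 'loop' does, so it is a tail call.
    let rec loop acc i =
        if i > n then acc
        else loop (acc + i) (i + 1L)
    loop 0L 1L

// Large enough that a non-tail-recursive version would likely exhaust the stack.
let total = sumTo 10000000L
printfn "%d" total
```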
The F# compiler will generate .tail IL instructions for all tail calls (unless the compiler flag that turns them off is used, for example when you want to keep stack frames for debugging), with the exception that directly tail-recursive functions will be optimized into loops. (EDIT: I think nowadays the F# compiler also fails to emit .tail in cases where it can prove there are no recursive loops through this call site; this is an optimization given that the .tail opcode is a little slower on many platforms.)
'tailcall' is a reserved keyword, with the idea that a future version of F# may allow you to write e.g.
tailcall func args
and then get a warning/error if it's not a tail call.
Only functions that are not naturally tail-recursive (and thus need an extra accumulator parameter) will 'force' you into the 'inner function' idiom.
Here's a code sample of what you asked:
let rec nTimes n f x =
    if n = 0 then
        x
    else
        nTimes (n-1) f (f x)
let r = nTimes 3 (fun s -> s ^ " is a rose") "A rose"
printfn "%s" r
I like the rule of thumb Paul Graham formulates in On Lisp: if there is work left to do, e.g. manipulating the recursive call output, then the call is not tail recursive.
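That rule of thumb can be illustrated with two ways of computing the length of a list. Both function names are hypothetical. In the first, the addition still has to happen after the recursive call returns, so the call is not a tail call. In the second, the pending work is carried along in an accumulator, and the recursive call is the last thing the function does.

```fsharp
// Not tail recursive: '1 + ...' is work left to do after the recursive call returns.
let rec lengthNaive list =
    match list with
    | [] -> 0
    | _ :: tail -> 1 + lengthNaive tail

// Tail recursive: nothing remains to be done after the call to 'loop'.
let lengthAcc list =
    let rec loop acc rest =
        match rest with
        | [] -> acc
        | _ :: tail -> loop (acc + 1) tail
    loop 0 list

let small = lengthNaive [1; 2; 3]
let large = lengthAcc [1 .. 1000000]
```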

Resources