Converting F# pipeline operators ( <|, >>, << ) to OCaml - f#

I'm converting some F# code to OCaml and I see a lot of uses of this pipeline operator <|, for example:
let printPeg expr =
printfn "%s" <| pegToString expr
The <| operator is apparently defined as just:
# let ( <| ) a b = a b ;;
val ( <| ) : ('a -> 'b) -> 'a -> 'b = <fun>
I'm wondering why they bother to define and use this operator in F#, is it just so they can avoid putting in parens like this?:
let printPeg expr =
Printf.printf "%s" ( pegToString expr )
As far as I can tell, that would be the conversion of the F# code above to OCaml, correct?
Also, how would I implement F#'s << and >> operators in Ocaml?
( the |> operator seems to simply be: let ( |> ) a b = b a ;; )

why they bother to define and use this operator in F#, is it just so they can avoid putting in parens?
It's because the functional way of programming assumes threading a value through a chain of functions. Compare:
let f1 str server =
str
|> parseUserName
|> getUserByName server
|> validateLogin <| DateTime.Now
let f2 str server =
validateLogin(getUserByName(server, (parseUserName str)), DateTime.Now)
In the first snippet, we clearly see everything that happens with the value. Reading the second one, we have to go through all parens to figure out what's going on.
This article about function composition seems to be relevant.
So yes, in a regular life, it is mostly about parens. But also, pipeline operators are closely related to partial function application and point-free style of coding. See Programming is "Pointless", for example.
The pipeline |> and function composition >> << operators can produce yet another interesting effect when they are passed to higher-level functions, like here.

Directly from the F# source:
let inline (|>) x f = f x
let inline (||>) (x1,x2) f = f x1 x2
let inline (|||>) (x1,x2,x3) f = f x1 x2 x3
let inline (<|) f x = f x
let inline (<||) f (x1,x2) = f x1 x2
let inline (<|||) f (x1,x2,x3) = f x1 x2 x3
let inline (>>) f g x = g(f x)
let inline (<<) f g x = f(g x)

OCaml Batteries supports these operators, but for reasons of precedence, associativity and other syntactic quirks (like Camlp4) it uses different symbols. Which particular symbols to use has just been settled recently, there are some changes. See: Batteries API:
val (|>) : 'a -> ('a -> 'b) -> 'b
Function application. x |> f is equivalent to f x.
val ( **> ) : ('a -> 'b) -> 'a -> 'b
Function application. f **> x is equivalent to f x.
Note The name of this operator is not written in stone. It is bound to change soon.
val (|-) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
Function composition. f |- g is fun x -> g (f x). This is also equivalent to applying <** twice.
val (-|) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b
Function composition. f -| g is fun x -> f (g x). Mathematically, this is operator o.
But Batteries trunk provides:
val ( ## ) : ('a -> 'b) -> 'a -> 'b
Function application. [f ## x] is equivalent to [f x].
val ( % ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b
Function composition: the mathematical [o] operator.
val ( |> ) : 'a -> ('a -> 'b) -> 'b
The "pipe": function application. [x |> f] is equivalent to [f x].
val ( %> ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
Piping function composition. [f %> g] is [fun x -> g (f x)].

I'm wondering why they bother to define and use this operator in F#, is it just so they can avoid putting in parens like this?
Excellent question. The specific operator you're referring to (<|) is pretty useless IME. It lets you avoid parentheses on rare occasions but more generally it complicates the syntax by dragging in more operators and makes it harder for less experienced F# programmers (of which there are now many) to understand your code. So I've stopped using it.
The |> operator is much more useful but only because it helps F# to infer types correctly in situations where OCaml would not have a problem. For example, here is some OCaml:
List.map (fun o -> o#foo) os
The direct equivalent fails in F# because the type of o cannot be inferred prior to reading its foo property so the idiomatic solution is to rewrite the code like this using the |> so F# can infer the type of o before foo is used:
os |> List.map (fun o -> o.foo)
I rarely use the other operators (<< and >>) because they also complicate the syntax. I also dislike parser combinator libraries that pull in lots of operators.
The example Bytebuster gave is interesting:
let f1 str server =
str
|> parseUserName
|> getUserByName server
|> validateLogin <| DateTime.Now
I would write this as:
let f2 str server =
let userName = parseUserName str
let user = getUserByName server userName
validateLogin user DateTime.Now
There are no brackets in my code. My temporaries have names so they appear in the debugger and I can inspect them and Intellisense can give me type throwback when I hover the mouse over them. These characteristics are valuable for production code that non-expert F# programmers will be maintaining.

Related

Dynamic functions in F#

I'm trying to explore the dynamic capabilities of F# for situations where I can't express some function with the static type system. As such, I'm trying to create a mapN function for (say) Option types, but I'm having trouble creating a function with a dynamic number of arguments. I've tried:
let mapN<'output> (f : obj) args =
let rec mapN' (state:obj) (args' : (obj option) list) =
match args' with
| Some x :: xs -> mapN' ((state :?> obj -> obj) x) xs
| None _ :: _ -> None
| [] -> state :?> 'output option
mapN' f args
let toObjOption (x : #obj option) =
Option.map (fun x -> x :> obj) x
let a = Some 5
let b = Some "hi"
let c = Some true
let ans = mapN<string> (fun x y z -> sprintf "%i %s %A" x y z) [a |> toObjOption; b |> toObjOption; c |> toObjOption]
(which takes the function passed in and applies one argument at a time) which compiles, but then at runtime I get the following:
System.InvalidCastException: Unable to cast object of type 'ans#47' to type
'Microsoft.FSharp.Core.FSharpFunc`2[System.Object,System.Object]'.
I realize that it would be more idiomatic to either create a computation expression for options, or to define map2 through map5 or so, but I specifically want to explore the dynamic capabilities of F# to see whether something like this would be possible.
Is this just a concept that can't be done in F#, or is there an approach that I'm missing?
I think you would only be able to take that approach with reflection.
However, there are other ways to solve the overall problem without having to go dynamic or use the other static options you mentioned. You can get a lot of the same convenience using Option.apply, which you need to define yourself (or take from a library). This code is stolen and adapted from F# for fun and profit:
module Option =
let apply fOpt xOpt =
match fOpt,xOpt with
| Some f, Some x -> Some (f x)
| _ -> None
let resultOption =
let (<*>) = Option.apply
Some (fun x y z -> sprintf "%i %s %A" x y z)
<*> Some 5
<*> Some "hi"
<*> Some true
To explain why your approach does not work, the problem is that you cannot cast a function of type int -> int (represented as FSharpFunc<int, int>) to a value of type obj -> obj (represented as FSharpFunc<obj, obj>). The types are the same generic types, but the cast fails because the generic parameters are different.
If you insert a lot of boxing and unboxing, then your function actually works, but this is probably not something you want to write:
let ans = mapN<string> (fun (x:obj) -> box (fun (y:obj) -> box (fun (z:obj) ->
box (Some(sprintf "%i %s %A" (unbox x) (unbox y) (unbox z))))))
[a |> toObjOption; b |> toObjOption; c |> toObjOption]
If you wanted to explore more options possible thanks to dynamic hacks - then you can probably do more using F# reflection. I would not typically use this in production (simple is better - I'd just define multiple map functions by hand or something like that), but the following runs:
let rec mapN<'R> f args =
match args with
| [] -> unbox<'R> f
| x::xs ->
let m = f.GetType().GetMethods() |> Seq.find (fun m ->
m.Name = "Invoke" && m.GetParameters().Length = 1)
mapN<'R> (m.Invoke(f, [| x |])) xs
mapN<obj> (fun a b c -> sprintf "%d %s %A" a b c) [box 1; box "hi"; box true]

Built in f# operator to compose functions with the same input but different outputs?

I understand the << compose operator takes two functions that both take in and return the same type. e.g. (lhs:'a -> 'a) -> (rhs:'a -> 'a) -> 'a
I often find myself wanting something like (lhs:'a -> 'b) -> (rhs:'c -> 'b) -> 'b in cases where I'm interested in side affects and not the return value 'b is probably the unit type. This is only when I have two lines in succession where I'm persisting something to a database.
Is there a built in function or idiomatic F# way of doing this without writing something like
let myCompose lhs rhs arg =
lhs arg
rhs arg
Backward composition operator (<<) is defined as:
( << ) : ('b -> 'c) -> ('a -> 'b) -> 'a -> 'c`
With two predicates applied, it is actually a function that takes initial value of 'a returning 'c, while the value of 'b is processed inside.
From the code sample you provided, let me assume that you need applying an argument to both predicates. There are several ways to do this:
Discarding the value returned by the (first) predicate, returning the original argument instead. Such operator exists in WebSharper:
let ( |>! ) x f = f x; x
// Usage:
let ret =
x
|>! f1
|>! f2
|> f3
I like this approach because:
it does not complicate things; each function application is atomic, and the code appears more readable;
it allows chaining throughout three or more predicates, like in the example above;
In this case, f must return unit, but you can easily work this around:
let ( |>!! ) x f = ignore(f x); x
Applying the argument to both predicates, returning a tuple of results, exactly as in your own example. There's such operator OCaml, easy to adapt to F#:
val (&&&) : ('a -> 'b) -> ('a -> 'c) -> 'a -> 'b * 'c
As #JackP noticed, &&& is already defined in F# for another purpose, so let's use another name:
/// Applying two functions to the same argument.
let (.&.) f g x = (f x, g x)
// Usage
let ret1, ret2 =
x
|> (f .&. g)
Note The samples above are for straight order of function application. If you need them applied in a reverse order, you need to modify the code accordingly.
The backward or reverse composition operator (<<) does not take two functions that both take in and return the same type; the only constraint is that the output type of the first function to be applied must be the same as the input type of the function it's being composed into. According to MSDN, the function signature is:
// Signature:
( << ) : ('T2 -> 'T3) -> ('T1 -> 'T2) -> 'T1 -> 'T3
// Usage:
func2 << func1
I don't know of a built-in composition operator that works like you want, but if this pattern is something you use frequently in your code and having such an operator would simplify your code, I think it's reasonable to define your own. For example:
> let (<<!) func2 func1 arg = func1 arg; func2 arg;;
val ( <<! ) : func2:('a -> 'b) -> func1:('a -> unit) -> arg:'a -> 'b
Or, if you know both functions are going to return unit, you can write it like this to constrain the output type to be unit:
> let (<<!) func2 func1 arg = func1 arg; func2 arg; ();;
val ( <<! ) : func2:('a -> unit) -> func1:('a -> unit) -> arg:'a -> unit
For composing of any number of functions of type f:'a->unit in any desired order you may simply fold their list:
("whatever",[ printfn "funX: %A"; printfn "funY: %A"; printfn "funZ: %A" ])
||> List.fold (fun arg f -> f arg; arg )
|> ignore
getting in FSI
funX: "whatever"
funY: "whatever"
funZ: "whatever"
val it : unit = ()

How to rewrite this function using the pipeline operator

These are the function definitions.
func1: 'a -> unit
func2: 'b -> 'a
func3: string -> 'b list
The current function
let f = Seq.iter((fun a -> func1(func2 a)) func3(s)
This is as far as I got
let f = func3(s)
|> ((fun a -> func2 a
|> func1)
|> Seq.iter)
I have the feeling it should be possible to loose the lambda and the parens'.
You can do without pipes, simply
Seq.iter (func1 << func2) << func3
(this is a function with some arguments [same than func3] and same output than Seq.iter).
You can test it
let func1 x = printfn "Number: %d" x
let func2 (a, b) = a + b
let func3 = Seq.map (fun n -> (n, 2 * n))
let f : (seq<_> -> unit) = Seq.iter (func1 << func2) << func3
f [1..5]
with output
Number: 3
Number: 6
Number: 9
Number: 12
Number: 15
val func1 : x:int -> unit
val func2 : a:int * b:int -> int
val func3 : (seq<int> -> seq<int * int>)
val f : (seq<int> -> unit)
val it : unit = ()
:)
You can use function composition operator >>:
func3() |> Seq.iter (func2 >> func1)
I think the question is, why do you want to use the pipeline operator?
I find your original code quite readable. You should not try to use pipeline operator (or function composition) just for the sake of using them. Now, in your code, the input s comes at the end, which is a bit unfortunate (you cannot quite see what is the main input for the code). I would probably rewrite it as (also, s is not really a descriptive name):
s |> func3
|> Seq.iter (fun a -> func1 (func2 a))
You can use function composition too - but I do not use it very often, because it does not (always) help with readability. But using it in the argument of Seq.iter is probably quite reasonable.
On a completely unrelated note, you could just use for loop and write:
for a in func3 s do
func1 (func2 a)
I actually find this more readable than any other version of the code here (if F# gives you a language feature for iterating over sequences that does exactly what you need, why not use it?)

Packrat parsing (memoization via laziness) in OCaml

I'm implementing a packrat parser in OCaml, as per the Master Thesis by B. Ford. My parser should receive a data structure that represents the grammar of a language and parse given sequences of symbols.
I'm stuck with the memoization part. The original thesis uses Haskell's lazy evaluation to accomplish linear time complexity. I want to do this (memoization via laziness) in OCaml, but don't know how to do it.
So, how do you memoize functions by lazy evaluations in OCaml?
EDIT: I know what lazy evaluation is and how to exploit it in OCaml. The question is how to use it to memoize functions.
EDIT: The data structure I wrote that represents grammars is:
type ('a, 'b, 'c) expr =
| Empty of 'c
| Term of 'a * ('a -> 'c)
| NTerm of 'b
| Juxta of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr * ('c -> 'c -> 'c)
| Alter of ('a, 'b, 'c) expr * ('a, 'b, 'c) expr
| Pred of ('a, 'b, 'c) expr * 'c
| NPred of ('a, 'b, 'c) expr * 'c
type ('a, 'b, 'c) grammar = ('a * ('a, 'b, 'c) expr) list
The (not-memoized) function that parse a list of symbols is:
let rec parse g v xs = parse' g (List.assoc v g) xs
and parse' g e xs =
match e with
| Empty y -> Parsed (y, xs)
| Term (x, f) ->
begin
match xs with
| x' :: xs when x = x' -> Parsed (f x, xs)
| _ -> NoParse
end
| NTerm v' -> parse g v' xs
| Juxta (e1, e2, f) ->
begin
match parse' g e1 xs with
| Parsed (y, xs) ->
begin
match parse' g e2 xs with
| Parsed (y', xs) -> Parsed (f y y', xs)
| p -> p
end
| p -> p
end
( and so on )
where the type of the return value of parse is defined by
type ('a, 'c) result = Parsed of 'c * ('a list) | NoParse
For example, the grammar of basic arithmetic expressions can be specified as g, in:
type nt = Add | Mult | Prim | Dec | Expr
let zero _ = 0
let g =
[(Expr, Juxta (NTerm Add, Term ('$', zero), fun x _ -> x));
(Add, Alter (Juxta (NTerm Mult, Juxta (Term ('+', zero), NTerm Add, fun _ x -> x), (+)), NTerm Mult));
(Mult, Alter (Juxta (NTerm Prim, Juxta (Term ('*', zero), NTerm Mult, fun _ x -> x), ( * )), NTerm Prim));
(Prim, Alter (Juxta (Term ('<', zero), Juxta (NTerm Dec, Term ('>', zero), fun x _ -> x), fun _ x -> x), NTerm Dec));
(Dec, List.fold_left (fun acc d -> Alter (Term (d, (fun c -> int_of_char c - 48)), acc)) (Term ('0', zero)) ['1';'2';'3';])]
The idea of using lazyness for memoization is use not functions, but data structures, for memoization. Lazyness means that when you write let x = foo in some_expr, foo will not be evaluated immediately, but only as far as some_expr needs it, but that different occurences of xin some_expr will share the same trunk: as soon as one of them force computation, the result is available to all of them.
This does not work for functions: if you write let f x = foo in some_expr, and call f several times in some_expr, well, each call will be evaluated independently, there is not a shared thunk to store the results.
So you can get memoization by using a data structure instead of a function. Typically, this is done using an associative data structure: instead of computing a a -> b function, you compute a Table a b, where Table is some map from the arguments to the results. One example is this Haskell presentation of fibonacci:
fib n = fibTable !! n
fibTable = [0,1] ++ map (\n -> fib (n - 1) + fib (n - 2)) [2..]
(You can also write that with tail and zip, but this doesn't make the point clearer.)
See that you do not memoize a function, but a list: it is the list fibTable that does the memoization. You can write this in OCaml as well, for example using the LazyList module of the Batteries library:
open Batteries
module LL = LazyList
let from_2 = LL.seq 2 ((+) 1) (fun _ -> true)
let rec fib n = LL.at fib_table (n - 1) + LL.at fib_table (n - 2)
and fib_table = lazy (LL.Cons (0, LL.cons 1 <| LL.map fib from_2))
However, there is little interest in doing so: as you have seen in the example above, OCaml does not particularly favor call-by-need evaluation -- it's reasonable to use, but not terribly convenient as it was forced to be in Haskell. It is actually equally simple to directly write the cache structure by direct mutation:
open Batteries
let fib =
let fib_table = DynArray.of_list [0; 1] in
let get_fib n = DynArray.get fib_table n in
fun n ->
for i = DynArray.length fib_table to n do
DynArray.add fib_table (get_fib (i - 1) + get_fib (i - 2))
done;
get_fib n
This example may be ill-chosen, because you need a dynamic structure to store the cache. In the packrat parser case, you're tabulating parsing on a known input text, so you can use plain arrays (indexed by the grammar rules): you would have an array of ('a, 'c) result option for each rule, of the size of the input length and initialized to None. Eg. juxta.(n) represents the result of trying the rule Juxta from input position n, or None if this has not yet been tried.
Lazyness is a nice way to present this kind of memoization, but is not always expressive enough: if you need, say, to partially free some part of your result cache to lower memory usage, you will have difficulties if you started from a lazy presentation. See this blog post for a remark on this.
Why do you want to memoize functions? What you want to memoize is, I believe, the parsing result for a given (parsing) expression and a given position in the input stream. You could for instance use Ocaml's Hashtables for that.
The lazy keyword.
Here you can find some great examples.
If it fits your use case, you can also use OCaml streams instead of manually generating thunks.

How do you curry the 2nd (or 3rd, 4th, ...) parameter in F# or any functional language?

I'm just starting up with F# and see how you can use currying to pre-load the 1st parameter to a function. But how would one do it with the 2nd, 3rd, or whatever other parameter? Would named parameters to make this easier? Are there any other functional languages that have named parameters or some other way to make currying indifferent to parameter-order?
Typically you just use a lambda:
fun x y z -> f x y 42
is a function like 'f' but with the third parameter bound to 42.
You can also use combinators (like someone mentioned Haskell's "flip" in a comment), which reorder arguments, but I sometimes find that confusing.
Note that most curried functions are written so that the argument-most-likely-to-be-partially-applied comes first.
F# has named parameters for methods (not let-bound function values), but the names apply to 'tupled' parameters. Named curried parameters do not make much sense; if I have a two-argument curried function 'f', I would expect that given
let g = f
let h x y = f x y
then 'g' or 'h' would be substitutable for 'f', but 'named' parameters make this not necessarily true. That is to say, 'named parameters' can interact poorly with other aspects of the language design, and I personally don't know of a good design offhand for 'named parameters' that interacts well with 'first class curried function values'.
OCaml, the language that F# was based on, has labeled (and optional) arguments that can be specified in any order, and you can partially apply a function based on those arguments' names. I don't believe F# has this feature.
You might try creating something like Haskell's flip function. Creating variants that jump the argument further in the argument list shouldn't be too hard.
let flip f a b = f b a
let flip2 f a b c = f b c a
let flip3 f a b c d = f b c d a
Just for completeness - and since you asked about other functional languages - this is how you would do it in OCaml, arguably the "mother" of F#:
$ ocaml
# let foo ~x ~y = x - y ;;
val foo : x:int -> y:int -> int = <fun>
# foo 5 3;;
- : int = 2
# let bar = foo ~y:3;;
val bar : x:int -> int = <fun>
# bar 5;;
- : int = 2
So in OCaml you can hardcode any named parameter you want, just by using its name (y in the example above).
Microsoft chose not to implement this feature, as you found out... In my humble opinion, it's not about "poor interaction with other aspects of the language design"... it is more likely because of the additional effort this would require (in the language implementation) and the delay it would cause in bringing the language to the world - when in fact only few people would (a) be aware of the "stepdown" from OCaml, (b) use named function arguments anyway.
I am in the minority, and do use them - but it is indeed something easily emulated in F# with a local function binding:
let foo x y = x - y
let bar x = foo x 3
bar ...
It's possible to do this without declaring anything, but I agree with Brian that a lambda or a custom function is probably a better solution.
I find that I most frequently want this for partial application of division or subtraction.
> let halve = (/) >> (|>) 2.0;;
> let halfPi = halve System.Math.PI;;
val halve : (float -> float)
val halfPi : float = 1.570796327
To generalize, we can declare a function applySecond:
> let applySecond f arg2 = f >> (|>) arg2;;
val applySecond : f:('a -> 'b -> 'c) -> arg2:'b -> ('a -> 'c)
To follow the logic, it might help to define the function thus:
> let applySecond f arg2 =
- let ff = (|>) arg2
- f >> ff;;
val applySecond : f:('a -> 'b -> 'c) -> arg2:'b -> ('a -> 'c)
Now f is a function from 'a to 'b -> 'c. This is composed with ff, a function from 'b -> 'c to 'c that results from the partial application of arg2 to the forward pipeline operator. This function applies the specific 'b value passed for arg2 to its argument. So when we compose f with ff, we get a function from 'a to 'c that uses the given value for the 'b argument, which is just what we wanted.
Compare the first example above to the following:
> let halve f = f / 2.0;;
> let halfPi = halve System.Math.PI;;
val halve : f:float -> float
val halfPi : float = 1.570796327
Also compare these:
let filterTwoDigitInts = List.filter >> (|>) [10 .. 99]
let oddTwoDigitInts = filterTwoDigitInts ((&&&) 1 >> (=) 1)
let evenTwoDigitInts = filterTwoDigitInts ((&&&) 1 >> (=) 0)
let filterTwoDigitInts f = List.filter f [10 .. 99]
let oddTwoDigitInts = filterTwoDigitInts (fun i -> i &&& 1 = 1)
let evenTwoDigitInts = filterTwoDigitInts (fun i -> i &&& 1 = 0)
Alternatively, compare:
let someFloats = [0.0 .. 10.0]
let theFloatsDividedByFour1 = someFloats |> List.map ((/) >> (|>) 4.0)
let theFloatsDividedByFour2 = someFloats |> List.map (fun f -> f / 4.0)
The lambda versions seem to be easier to read.
In Python, you can use functools.partial, or a lambda. Python has named arguments.
functools.partial can be used to specify the first positional arguments as well as any named argument.
from functools import partial
def foo(a, b, bar=None):
...
f = partial(foo, bar='wzzz') # f(1, 2) ~ foo(1, 2, bar='wzzz')
f2 = partial(foo, 3) # f2(5) ~ foo(3, 5)
f3 = lambda a: foo(a, 7) # f3(9) ~ foo(9, 7)

Resources