F# "do" statements as Block expressions

Section 10.2.5 of the F# spec (do statements in modules) says that a module-level do statement may carry attributes, and that it produces a warning when its expression does not have type unit. In other positions, a do statement apparently behaves like a block expression (6.5.1), that is, a parenthesized or begin-end block, while still asserting the unit type.
This makes the do statement suitable for controlling the scope of deterministic disposal expressions (6.6.4):
let d x =
    printf "new %s, " x
    { new System.IDisposable with
        member __.Dispose() = printf "disposing %s, " x }
let ab() =
    use a = d "a"
    use b = d "b"
    printf "a + b, "
ab(); printfn ""
// prints 'new a, new b, a + b, disposing b, disposing a, '
let aba() =
    use a = d "a"
    do use b = d "b"
       printf "a + b, "
    printf "a, "
aba(); printfn ""
// prints 'new a, new b, a + b, disposing b, a, disposing a, '
If it is not explicitly in the spec, is there anything else that keeps the maintainers of the F# compiler from breaking this usage?

I think that the aba function in your example is actually using both a do statement and a block expression. The syntax of do is do <expr> so the body of the do statement is a single expression. If we write the same thing with parentheses:
let aba() =
    use a = d "a"
    do ( use b = d "b"
         printf "a + b, " )
    printf "a, "
The deterministic disposal behaviour then follows from the usual deterministic disposal behaviour of a block expression - rather than from some undocumented property of do blocks.

You don't actually need the do statement at all here, and using parentheses will be enough:
let aba() =
    use a = d "a"
    (
        use b = d "b"
        printf "a + b, "
    )
    printf "a, "
aba(); printfn ""
// Prints "new a, new b, a + b, disposing b, a, disposing a, "
This is Tomas Petricek's example minus the do. The parentheses define a new scope for the code block they contain, so when you reach the closing parenthesis, b goes out of scope and is disposed. To prove that the scoping rules work like this, let's add another variable definition:
let aba() =
    use a = d "a"
    (
        use b = d "b"
        let c = 5
        printf "a + b, and c=%d " c // This compiles
    )
    printf "and now c=%d " c // This does not compile
    printf "a, "
aba(); printfn ""
The first usage of c, inside the parentheses, will compile. The second use of c, outside the parentheses, will not compile since c is out of scope at that point, and you'll get:
error FS0039: The value or constructor 'c' is not defined.
So if you're using the do expression for the sake of controlling when the IDisposable goes out of scope, you can achieve the same result with parentheses alone. Up to you which form you find more readable.
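As a minimal sketch of the same scoping behaviour in a form that can be checked programmatically, the variant below records events in a list instead of printing them. The names `events`, `log`, `tracked`, and `scoped` are made up for this illustration and do not appear in the examples above.

```fsharp
// Hypothetical helpers: collect events in order instead of printing them.
let events = ResizeArray<string>()
let log (s: string) = events.Add s

let tracked name =
    log ("new " + name)
    { new System.IDisposable with
        member __.Dispose() = log ("disposing " + name) }

let scoped () =
    use a = tracked "a"
    (
        use b = tracked "b"
        log "a + b"
    )   // b is disposed here, at the end of the parenthesized block
    log "a"
    // a is disposed here, when the function body ends

scoped ()
printfn "%s" (String.concat ", " events)
// prints "new a, new b, a + b, disposing b, a, disposing a"
```

Recording the events rather than printing them makes the disposal order easy to assert in a test, which is one way to guard against the behaviour changing unnoticed.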


Is there a way to fix an expression with operators in it after parsing, using a table of associativities and precedences?

I'm currently working on a parser for a simple programming language written in Haskell. I ran into a problem when I tried to allow for binary operators with differing associativities and precedences. Normally this wouldn't be an issue, but since my language allows users to define their own operators, the precedence of operators isn't known by the compiler until the program has already been parsed.
Here are some of the data types I've defined so far:
data Expr
  = Var String
  | Op String Expr Expr
  | ..
data Assoc
  = LeftAssoc
  | RightAssoc
  | NonAssoc
type OpTable =
  Map.Map String (Assoc, Int)
At the moment, the compiler parses all operators as if they were right-associative with equal precedence. So if I give it an expression like a + b * c < d the result will be Op "+" (Var "a") (Op "*" (Var "b") (Op "<" (Var "c") (Var "d"))).
I'm trying to write a function called fixExpr which takes an OpTable and an Expr and rearranges the Expr based on the associativities and precedences listed in the OpTable. For example:
operators :: OpTable
operators =
  Map.fromList
    [ ("<", (NonAssoc, 4))
    , ("+", (LeftAssoc, 6))
    , ("*", (LeftAssoc, 7))
    ]
expr :: Expr
expr = Op "+" (Var "a") (Op "*" (Var "b") (Op "<" (Var "c") (Var "d")))
fixExpr operators expr should evaluate to Op "<" (Op "+" (Var "a") (Op "*" (Var "b") (Var "c"))) (Var "d").
How do I define the fixExpr function? I've tried multiple solutions and none of them have worked.
An expression e may be an atomic term n (e.g. a variable or literal), a parenthesised expression, or an application of an infix operator ○.
e ⩴ n | (e) | e1 ○ e2
We need the parentheses to know whether the user entered a * b + c, which we happen to associate as a * (b + c) and need to reassociate as (a * b) + c, or if they entered a * (b + c) literally, which should not be reassociated. Therefore I’ll make a small change to the data type:
data Expr
  = Var String
  | Group Expr
  | Op String Expr Expr
  | …
Then the method is simple:
1. The rebracketing of an expression ⟦e⟧ applies recursively to all its subexpressions.
   1.1. ⟦n⟧ = n
   1.2. ⟦(e)⟧ = (⟦e⟧)
   1.3. ⟦e1 ○ e2⟧ = ⦅⟦e1⟧ ○ ⟦e2⟧⦆
2. A single reassociation step ⦅e⦆ removes redundant parentheses on the right, and reassociates nested operator applications leftward in two cases: if the left operator has higher precedence, or if the two operators have equal precedence and are both left-associative. It leaves nested infix applications alone, that is, associating rightward, in the opposite cases: if the right operator has higher precedence, or the operators have equal precedence and right associativity. If the associativities are mismatched, then the result is undefined.
   2.1. ⦅e ○ n⦆ = e ○ n
   2.2. ⦅e1 ○ (e2)⦆ = ⦅e1 ○ e2⦆
   2.3. ⦅e1 ○ (e2 ● e3)⦆ =
        2.3.1. ⦅e1 ○ e2⦆ ● e3, if:
               a. P(○) > P(●); or
               b. P(○) = P(●) and A(○) = A(●) = L
        2.3.2. e1 ○ (e2 ● e3), if:
               a. P(○) < P(●); or
               b. P(○) = P(●) and A(○) = A(●) = R
        2.3.3. undefined otherwise
NB.: P(o) and A(o) are respectively the precedence and associativity (L or R) of operator o.
This can be translated fairly literally to Haskell:
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

-- Assumes Assoc derives Show (needed for the error message below).
fixExpr :: OpTable -> Expr -> Expr
fixExpr operators = reassoc
  where
    -- 1.1
    reassoc e@Var{} = e
    -- 1.2
    reassoc (Group e) = Group (reassoc e)
    -- 1.3
    reassoc (Op o e1 e2) = reassoc' o (reassoc e1) (reassoc e2)

    -- 2.1
    reassoc' o e1 e2@Var{} = Op o e1 e2
    -- 2.2
    reassoc' o e1 (Group e2) = reassoc' o e1 e2
    -- 2.3
    reassoc' o1 e1 r@(Op o2 e2 e3) = case compare prec1 prec2 of
      -- 2.3.1a
      GT -> assocLeft
      -- 2.3.2a
      LT -> assocRight
      EQ -> case (assoc1, assoc2) of
        -- 2.3.1b
        (LeftAssoc, LeftAssoc) -> assocLeft
        -- 2.3.2b
        (RightAssoc, RightAssoc) -> assocRight
        -- 2.3.3
        _ -> error $ concat
          [ "cannot mix ‘", o1
          , "’ ("
          , show assoc1
          , " "
          , show prec1
          , ") and ‘"
          , o2
          , "’ ("
          , show assoc2
          , " "
          , show prec2
          , ") in the same infix expression"
          ]
      where
        (assoc1, prec1) = opInfo o1
        (assoc2, prec2) = opInfo o2
        assocLeft = Op o2 (Group (reassoc' o1 e1 e2)) e3
        assocRight = Op o1 e1 r

    opInfo op = fromMaybe (notFound op) (Map.lookup op operators)
    notFound op = error $ concat
      [ "no precedence/associativity defined for ‘"
      , op
      , "’"
      ]
Note the recursive call in assocLeft: by reassociating the operator applications, we may have revealed another association step, as in a chain of left-associative operator applications like a + b + c + d = (((a + b) + c) + d).
I insert Group constructors in the output for illustration, but they can be removed at this point, since they’re only necessary in the input.
This hasn’t been tested very thoroughly at all, but I think the idea is sound, and should accommodate modifications for more complex situations, even if the code leaves something to be desired.
An alternative that I’ve used is to parse expressions as “flat” sequences of operators applied to terms, and then run a separate parsing pass after name resolution, using e.g. Parsec’s operator precedence parser facility, which would handle these details automatically.

Confusing anonymous function construct

I'm reading through an F# tutorial, and ran into an example of syntax that I don't understand. The link to the page I'm reading is at the bottom. Here's the example from that page:
let rec quicksort2 = function
    | [] -> []
    | first::rest ->
        let smaller,larger = List.partition ((>=) first) rest
        List.concat [quicksort2 smaller; [first]; quicksort2 larger]
// test code
printfn "%A" (quicksort2 [1;5;23;18;9;1;3])
The part I don't understand is this: ((>=) first). What exactly is this? For contrast, this is an example from the MSDN documentation for List.partition:
let list1 = [ 1 .. 10 ]
let listEven, listOdd = List.partition (fun elem -> elem % 2 = 0) list1
printfn "Evens: %A\nOdds: %A" listEven listOdd
The first parameter (is this the right terminology?) to List.partition is obviously an anonymous function. I rewrote the line in question as this:
let smaller,larger = List.partition (fun e -> first >= e) rest
and it works the same as the example above. I just don't understand how this construct accomplishes the same thing: ((>=) first)
http://fsharpforfunandprofit.com/posts/fvsc-quicksort/
That's roughly the same thing as infix notation vs prefix notation.
Operators are functions too and follow the same rules (i.e. they can be partially applied).
So here (>=) first is the operator >= with first already applied as its first (left) operand. It gives back a function waiting for the second operand of the operator, as you noticed when rewriting that line.
This construct combines two features: operator call with prefix notation and partial function application.
First, let's look at calling operators with prefix notation.
let x = a + b
The above code calls operator + with two arguments, a and b. Since this is a functional language, everything is a function, including operators, including operator +. It's just that operators have this funny call syntax, where you put the function between the arguments instead of in front of them. But you can still treat the operator just as any other normal function. To do that, you need to enclose it in parentheses:
let x = (+) a b // same thing as a + b.
And when I say "as any other function", I totally mean it:
let f = (+)
let x = f a b // still same thing.
Next, let's look at partial function application. Consider this function:
let f x y = x + y
We can call it and get a number in return:
let a = f 5 6 // a = 11
But we can also "almost" call it by supplying only one of two arguments:
let a = f 5 // a is a function
let b = a 6 // b = 11
The result of such an "almost call" (technically called "partial application") is another function that still expects the remaining arguments.
And now, let's combine the two:
let a = (+) 5 // a is a function
let b = a 6 // b = 11
In general, one can write the following equivalency:
(+) x === fun y -> x + y
Or, similarly, for your specific case:
(>=) first === fun y -> first >= y
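
A short sketch that checks this equivalence on the partition step from the quicksort example. The values `first` and `rest` are sample data chosen here for illustration. Note the argument order: `(>=) first` fixes the left operand, so it tests `first >= y`, not `y >= first`.

```fsharp
let first = 5
let rest = [1; 9; 5; 3; 7]

// Partially applied operator: (>=) first  ===  fun y -> first >= y
let viaOperator = List.partition ((>=) first) rest
// Explicit lambda, as in the rewritten line from the question
let viaLambda = List.partition (fun e -> first >= e) rest

printfn "%A" viaOperator               // ([1; 5; 3], [9; 7])
printfn "%b" (viaOperator = viaLambda) // true
```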

Replacing a value with gsub in Lua

function expandVars(tmpl,t)
  return (tmpl:gsub('%$([%a ][%w ]+)', t))
end
local sentence = expandVars("The $adj $char1 looks at you and says, $name, you are $result", {adj="glorious", name="Jayant", result="the Overlord", char1="King"})
print(sentence)
The above code works only when there is a ',' after the variable name: in the sentence above it works for $name and $result, but not for $adj and $char1. Why is that?
Problem:
Your pattern [%a ][%w ]+ means a letter or space, followed by at least one letter, digit, or space. Since the + quantifier in Lua patterns is greedy, it matches as long a sequence as possible, so the match includes the trailing space and any words that follow it:
function expandVars(tmpl,t)
  return string.gsub(tmpl, '%$([%a ][%w ]+)', t)
end
local sentence = expandVars(
  "$a1 $b and c $d e f ",
  {["a1 "]="(match is 'a1 ')", ["b and c "]="(match is 'b and c ')", ["d e f "]="(match is 'd e f ')", }
)
print(sentence)
This prints
(match is 'a1 ')(match is 'b and c ')(match is 'd e f ')
Solution:
The variable names must match keys in your table. You could accept keys that contain spaces and all sorts of other characters, but then you force the user to write the table keys with [], as done above, which is not very nice :)
Better to keep names to alphanumerics and underscores, with the constraint that a name cannot start with a digit. To be generic, you want a letter (%a) followed by any number (including none, hence * rather than +) of alphanumerics and underscores ([%w_]):
function expandVars(tmpl,t)
  return string.gsub(tmpl, '%$(%a[%w_]*)', t)
end
local sentence = expandVars(
  "$a $b1 and c $d_2 e f; non-matchable: $_a $1a b",
  {a="(match is 'a')", b1="(match is 'b1')", d_2="(match is 'd_2')", }
)
print(sentence)
This prints
(match is 'a') (match is 'b1') and c (match is 'd_2') e f; non-matchable: $_a $1a b
which shows how the leading underscore and leading digit were not accepted.

Pattern matching by function call

F# assigns function arguments via pattern matching. This is why
// ok: pattern matching of tuples upon function call
let g (a,b) = a + b
g (7,4)
works: the tuple is matched with (a,b), and a and b are available directly inside g.
Doing the same with discriminated unions would be equally beneficial, but I cannot get it to work:
// error: same with discriminated unions
type A =
    | X of int * int
    | Y of string
let f A.X(a, b) = a + b // Error: Successive patterns
                        // should be separated by spaces or tupled
// EDIT, integrating the answer:
let f (A.X(a, b)) = a + b // correct
f (A.X(7, 4))
Is pattern matching as part of the function call limited to tuples? Is there a way to do it with discriminated unions?
You need extra parens:
let f (A.X(a, b)) = a + b
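One caveat worth sketching: because A has a second case Y, the pattern A.X(a, b) in the parameter position is incomplete. As far as I know, the compiler reports this as warning FS0025 (incomplete pattern matches), and calling f with a Y value fails at run time with a MatchFailureException. The function `tryAdd` below is a hypothetical total alternative and is not part of the answer above.

```fsharp
type A =
    | X of int * int
    | Y of string

// Compiles, but with an incomplete-match warning, since Y is not handled.
let f (A.X(a, b)) = a + b

// Hypothetical total alternative: handle every case explicitly.
let tryAdd = function
    | X(a, b) -> Some (a + b)
    | Y _ -> None

printfn "%d" (f (X(7, 4)))                      // 11
printfn "%A" (tryAdd (X(7, 4)))                 // Some 11
printfn "%b" (Option.isNone (tryAdd (Y "text"))) // true
```

For a union with a single case, the parameter pattern is complete, so no warning should be expected there.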

How should I declare pipelines?

Does it matter how I declare a pipeline? I know of two ways:
let hello name = "Hello " + name + "!"
let solution1 = hello <| "Homer"
let solution2 = "Homer" |> hello
Which would you choose? solution1 or solution2 - and why?
The pipe-forward operator |> helps with chaining function calls and with type inference. It lets you write the last argument of a function first, in front of the function. This enables a chaining of functions that is very readable (similar to LINQ in C#). Your example doesn't show the power of this; it really shines when you have a transformation "pipeline" set up for several functions in a row.
Using |> chaining you could write:
let createPerson n =
    if n = 1 then "Homer" else "Someone else"
let hello name = "Hello " + name + "!"
let solution2 =
    1
    |> createPerson
    |> hello
    |> printf "%s"
The benefit of the pipe-backward operator <| is that it changes operator precedence, so it can save you a lot of brackets. Function application is left-associative and binds tighter than any operator, so hello createPerson 1 would be read as (hello createPerson) 1. With <| you don't need the brackets when you want to pass the result of one function call to another function. Your example doesn't really take advantage of this.
These would be equivalent:
let createPerson n =
    if n = 1 then "Homer" else "Someone else"
let hello name = "Hello " + name + "!"
let solution3 = hello <| createPerson 1
let solution4 = hello (createPerson 1)
F# type inference reads code top-to-bottom, left-to-right. For this reason, the |> operator is used much more than <|: the argument comes first, so its type is already known when the function applied to it is checked.
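The type-inference point can be sketched with a lambda that accesses a property. This is a minimal illustration with made-up data (`names`). The commented-out line shows the form that, to my knowledge, is rejected with error FS0072, because the type of s is not yet known when s.Length is checked.

```fsharp
let names = ["Homer"; "Marge"; "Bart"]

// Works: names comes first, so s is already known to be a string.
let lengths = names |> List.map (fun s -> s.Length)

// Typically rejected without a type annotation on s:
// let lengths' = List.map (fun s -> s.Length) names

// <| only saves the brackets around the last argument.
let total = List.sum <| List.map String.length names

printfn "%A" lengths  // [5; 5; 4]
printfn "%d" total    // 14
```

Adding an annotation, as in (fun (s: string) -> s.Length), is the usual way to make the non-piped form compile.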
