Why is "do" allowed inside a function? - f#

I noticed that the following code compiles and works in VS 2013:
let f() =
    do Console.WriteLine(41)
    42
But when looking at the F# 3.0 specification I can't find any mention of do being used this way. As far as I can tell, do can have the following uses:
As part of a loop (e.g. while expr do expr done); that's not the case here.
Inside computation expressions, e.g.:
seq {
    for i in 1..2 do
        do Console.WriteLine(i)
        yield i * 2
}
That's not the case here either, f doesn't contain any computation expressions.
Though what confuses me here is that according to the specification, do should be followed by in. That in should be optional due to lightweight syntax, but adding it here causes a compile error (“Unexpected token 'in' or incomplete expression”).
Statement inside a module or class. This is also not the case here, the do is inside a function, not inside a module or a class.
I also noticed that with #light "off", the code doesn't compile (“Unexpected keyword 'do' in binding”), but I didn't find anything that would explain this in the section on lightweight syntax either.
Based on all this, I would assume that using do inside a function this way should not compile, but it does. Did I miss something in the specification? Or is this actually a bug in the compiler or in the specification?

From the documentation on MSDN:
A do binding is used to execute code without defining a function or value.
Even though the spec doesn't contain a comprehensive list of the places it is allowed, it is merely an expression asserted to be of type unit. Some examples:
if ((do ()); true) then ()
let x: unit = do ()
It is generally omitted. Each of the preceding examples is valid without do. Therefore, do serves only to assert that an expression is of type unit.
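To make the point concrete, here is a minimal sketch (the names withDo, withoutDo and u are made up for illustration, they are not from the question). It shows the same function body written with and without do, plus do () bound as a unit value:

```fsharp
open System

// With an explicit 'do': the side effect runs, then 42 is the result.
let withDo () =
    do Console.WriteLine(41)
    42

// The same body with 'do' omitted; it behaves identically.
let withoutDo () =
    Console.WriteLine(41)
    42

// 'do expr' is itself an expression of type unit, so it can be bound to a value.
let u : unit = do ()
```

Both functions print 41 and return 42, which is consistent with do adding nothing beyond the unit assertion.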

Going through the F# 3.0 specification, the syntax has do expr as a choice of class-function-or-value-defn (types) [Ch 8, A.2.5] and module-function-or-value-defn (modules) [Ch 10, A.2.1.1].
I don't actually see in the spec where function-defn can have more than one expression, as long as all but the last one evaluate to unit -- or that all but the last expression are ignored in determining the function's return value.
So, it seems this is an oversight in the documentation.

Related

How to invoke Erlang function with variable?

4> abs(1).
1
5> X = abs.
abs
6> X(1).
** exception error: bad function abs
7> erlang:X(1).
1
8>
Is there any particular reason why I have to use the module name when I invoke a function with a variable? This isn't going to work for me because, well, for one thing it is just way too much syntactic garbage and makes my eyes bleed. For another thing, I plan on invoking functions out of a list, something like (off the top of my head):
[X(1) || X <- [abs, f1, f2, f3...]].
Attempting to tack on various module names here is going to make the verbosity go through the roof, when the whole point of what I am doing is to reduce verbosity.
EDIT: Look here: http://www.erlangpatterns.org/chain.html The guy has made some pipe-forward function. He is invoking functions the same way I want to above, but his code doesn't work when I try to use it. But from what I know, the guy is an experienced Erlang programmer - I saw him give some keynote or whatever at a conference (well I saw it online).
Did this kind of thing used to work but not anymore? Surely there is a way I can do what I want - invoke these functions without all the verbosity and boilerplate.
EDIT: If I am reading the documentation right, it seems to imply that my example at the top should work (section 8.6) http://erlang.org/doc/reference_manual/expressions.html
I know abs is an atom, not a function. [...] Why does it work when the module name is used?
The documentation explains that (slightly reorganized):
ExprM:ExprF(Expr1,...,ExprN)
each of ExprM and ExprF must be an atom or an expression that
evaluates to an atom. The function is said to be called by using the
fully qualified function name.
ExprF(Expr1,...,ExprN)
ExprF
must be an atom or evaluate to a fun.
If ExprF is an atom the function is said to be called by using the implicitly qualified function name.
When using fully qualified function names, Erlang expects atoms or expressions that evaluate to atoms. In other words, you have to bind X to an atom: X = atom. That's exactly what you provide.
But in the second form, Erlang expects either an atom or an expression that evaluates to a function. Notice that last word. In other words, if you do not use a fully qualified function name, you have to bind X to a function: X = fun module:function/arity.
In the expression X = abs, abs is not a function but an atom. If you want to bind a function instead, you can do so:
D = fun erlang:abs/1.
or so:
X = fun(X)->abs(X) end.
Try:
X = fun(Number) -> abs(Number) end.
Updated:
After looking at the discussion more, it seems like you're wanting to apply multiple functions to some input.
There are two projects that I haven't used personally, but I've starred on Github that may be what you're looking for.
Both of these projects use parse transforms:
fun_chain https://github.com/sasa1977/fun_chain
pipeline https://github.com/stolen/pipeline
Pipeline is unique because it uses a special syntax:
Result = [fun1, mod2:fun2, fun3] (Arg1, Arg2).
Of course, it could also be possible to write your own function to do this using a list of {module, function} tuples and applying the function to the previous output until you get the result.

Why are redundant parentheses not allowed in syntax definitions?

This syntax module is syntactically valid:
module mod1
syntax Empty =
;
And so is this one, which should be an equivalent grammar to the previous one:
module mod2
syntax Empty =
( )
;
(The resulting parser accepts only empty strings.)
Which means that you can make grammars such as this one:
module mod3
syntax EmptyOrKitchen =
( ) | "kitchen"
;
But the following is not allowed (nested parentheses):
module mod4
syntax Empty =
(( ))
;
I would have guessed that redundant parentheses are allowed, since they are allowed in things like expressions, e.g. ((2)) + 2.
This problem came up when working with the data types for the internal representation of Rascal syntax definitions. The following code will create the same module as in the last example, namely mod4 (modulo some whitespace):
import Grammar;
import lang::rascal::format::Grammar;
str sm1 = definition2rascal(\definition("unknown_main",("the-module":\module("unknown",{},{},grammar({sort("Empty")},(sort("Empty"):prod(sort("Empty"),[
alt({seq([])})
],{})))))));
The problematic part of the data is on its own line - alt({seq([])}). If this code is changed to seq([]), then you get the same syntax module as mod2. If you further delete this whole expression, i.e. so that you get this:
str sm3 =
definition2rascal(\definition("unknown_main",("the-module":\module("unknown",{},{},grammar({sort("Empty")},(sort("Empty"):prod(sort("Empty"),[
], {})))))));
Then you get mod1.
So should such redundant parentheses be printed by the definition2rascal(...) function? And should it matter with regards to making the resulting module valid or not?
They are not allowed basically because we wanted to see if we could do without them. There is currently no priority relation between the symbol kinds, so in general there is no need for a bracket syntax (as there is for + and * in expressions).
Already the brackets have two different semantics, one () being the epsilon symbol and two (Sym1 Sym2 ...) being a nested sequence. This nested sequence is defined (syntactically) to expect at least two symbols. Now we could without ambiguity introduce a third semantics for the brackets with a single symbol or relax the requirement for sequence... But we reckoned it would be confusing that in one case you would get an extra layer in the resulting parse tree (sequence), while in the other case you would not (ignored superfluous bracket).
In more detail, the problem of printing seq([]) is not so much a problem of the meta syntax but rather that the backing abstract notation is more relaxed than the concrete notation (i.e. it is a bigger language, or an over-approximation). The parser generator will generate a working parser for seq([]). But there is no Rascal notation for an empty sequence, and I guess the pretty printer should throw an exception.

High-precedence application expressions as arguments

A high-precedence application expression is one in which an identifier is immediately followed by a left paren without intervening whitespace, e.g., f(g). Parentheses are required when passing these as function arguments: func (f(g)).
Section 15.2 of the spec states the grammar and precedence rules allow the unparenthesized form -- func f(g) -- but an additional check prevents this.
Why is this intentionally prohibited? It would obviate the need for excessive parentheses and piping, and generally make the code much cleaner.
A common example is
raise <| IndexOutOfRangeException()
or
raise (IndexOutOfRangeException())
could become simply
raise IndexOutOfRangeException()
I agree that the need to write the additional parentheses is a bit annoying. I think the main reason it is not allowed to omit them is that adding whitespace would then change the meaning of your code in quite a significant way:
// Call 'foo' with the result of 'bar()' as an argument
foo bar()
// Call 'foo' with 'bar' as the first argument and '()' as the second
foo bar ()
There are still some rough edges where adding parens changes the evaluation (see this forum post), but that "just" changes the evaluation order. This would change the meaning of your code!
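For illustration, here is a small sketch of the two readings in code. The helper names are hypothetical stand-ins for foo and bar; two separate functions are used because a single foo cannot have both shapes at once:

```fsharp
// Hypothetical stand-in for 'bar' in the comments above.
let bar () = 20

// A one-argument 'foo': it receives the *result* of calling bar.
let fooOfResult (x: int) = x + 1

// A two-argument 'foo': it receives the function 'bar' itself, then unit.
let fooOfFunction (f: unit -> int) () = f () * 2

// Parenthesized high-precedence application: bar() is evaluated first.
let a = fooOfResult (bar())

// With whitespace, 'bar' and '()' are two separate arguments.
let b = fooOfFunction bar ()
```

Here a is 21 and b is 40, so the two spellings really are different calls. If the unparenthesized form were allowed, a single space would switch between them.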

When do you put double semicolons in F#?

This is a stupid question. I've been reading a couple books on F# and can't find anything that explains when you put ;; after a statement, nor can I find a pattern in the reading. When do you end a statement with double semi-colons?
In non-interactive F# code that's not supposed to be compatible with OCaml, you should never need a double semicolon. In the OCaml-compatible mode, you would use it at the end of a top-level function declaration (in recent versions, you can switch to this mode by using files with the .ml extension or by adding #light "off" to the top).
If you're using the command-line fsi.exe tool or F# Interactive in Visual Studio then you'd use ;; to end the current input for F#.
When I'm posting code samples here at StackOverflow (and in the code samples from my book), I use ;; in the listing when I also want to show the result of evaluating the expression in F# interactive:
Listing from F# interactive
> "Hello" + " world!";;
val it : string = "Hello world!"
> 1 + 2;;
val it : int = 3
Standard F# source code
let n = 1 + 2
printf "Hello world!"
Sometimes it is also useful to show the output as part of the listing, so I find this notation quite useful, but I never explained it anywhere, so it's great that you asked!
Are you talking about F# proper or about running F# functions in F# Interactive? In F# Interactive, ;; forces execution of the code just entered. Other than that, ;; does not have any special meaning that I know of.
In F#, the only place ;; is required is to end expressions in the interactive mode.
;; is left over from the transition from OCaml, where in turn it is left over from Caml Light. Originally ;; was used to end top-level "phrases"--that is, let, type, etc. OCaml made ;; optional since the typical module consists of a series of let statements with maybe one statement at the end to call the main function. If you deviate from this pattern, you need to separate the statements with ;;. Unfortunately, in OCaml, when ;; is optional versus required is hard to learn.
However, F# introduces two relevant modifications to OCaml syntax: indentation and do. Top-level statements have to go inside a do block, and indentation is required for blocks, so F# always knows that each top-level statement begins with do and an indent and ends with an outdent. No more ;; required.
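As a small sketch of that point (greet is a made-up name): a top-level statement can be written with an explicit do, and as far as I know the light syntax also accepts it with the do left implicit, so neither form needs ;; in a source file:

```fsharp
// An ordinary top-level definition; no terminator is needed.
let greet name = sprintf "Hello, %s" name

// A top-level statement written with an explicit 'do' binding.
do printfn "%s" (greet "world")

// The same kind of statement with 'do' left implicit.
printfn "%s" (greet "again")
```

Indentation alone marks where each top-level item ends.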
Overall, all you need to know is that [O']Caml's syntax sucks, and F# fixes a lot of its problems, but maintains a lot of confusing backward compatibility. (I believe that F# can still compile a lot of OCaml code.)
Note: This answer was based on my experience with OCaml and the link Adam Gent posted (which is unfortunately not very enlightening unless you know OCaml).
Symbol and Operator Reference (F#)
http://msdn.microsoft.com/en-us/library/dd233228(v=VS.100).aspx
Semi Colon:
• Separates expressions (used mostly in verbose syntax).
• Separates elements of a list.
• Separates fields of a record.
Double Semi Colon:
http://www.ffconsultancy.com/products/fsharp_journal/free/introduction.html
Articles in The F#.NET Journal quote F# code as it would appear in an interactive session. Specifically, the interactive session provides a > prompt, requires a double semicolon ;; identifier at the end of a code snippet to force evaluation, and returns the names (if any) and types of resulting definitions and values.
I suspect that you have seen F# code written when #light syntax wasn't enabled by default (#light syntax is on by default for the May 2009 CTP and later ones as well as for Visual Studio 2010) and then ;; means the end of a function declaration.
So what is #light syntax? It comes with the #light declaration:
The #light declaration makes whitespace significant. Allowing the developer to omit certain keywords such as in, ;, ;;, begin, and end.
Here's code written without #light syntax:
let halfWay a b =
    let dif = b - a in
    let mid = dif / 2 in
    mid + a;;
and with light syntax it becomes:
#light
let halfWay a b =
    let dif = b - a
    let mid = dif / 2
    mid + a
As said you can omit the #light declaration now (which should be the case if you're on a recent CTP or Visual Studio 2010).
See also this thread if you want to know more about the #light syntax: F# - Should I learn with or without #light?
The double semi-colon is used to mark the end of a block of code that is ready for evaluation in F# interactive when you are typing directly into the interactive session. For example, when using it as a calculator.
This is rarely seen in F# because you typically write code into a script file, highlight it and use ALT+ENTER to have it evaluated, with Visual Studio effectively injecting the ;; at the end for you.
OCaml is the same.
Literature often quotes code written as it would appear if it had been typed into an interactive session because this is a clear way to convey not only the code but also its inferred type. For example:
> [1; 2; 3];;
val it : int list = [1; 2; 3]
This means that you type the expression [1; 2; 3] into the interactive session followed by the ;; denoting the end of a block of code that is ready to be evaluated interactively and the compiler replies with val it : int list = [1; 2; 3] describing that the expression evaluated to a value of the type int list.
The double semicolon most likely comes from OCaml since that is what the language is based on.
Basically it's for historical purposes, and you need it for the evaluator (REPL) if you use it.
There is no purpose for double semi-colons (outside of F# interactive). The semi-colon, according to MSDN:
Separates expressions (used mostly in verbose syntax).
Separates elements of a list.
Separates fields of a record.
Therefore, in the first instance, ;; would be separating the expression before the first semi-colon from the empty expression after it but before the second semi-colon, and separating that empty expression from whatever came after the second semi-colon (just as in, say, C# or C++).
In the instance of the list, I suspect you'd get an error for defining an empty list element.
With regards to the record, I suspect it would be similar to separating expressions, with the empty space between the semi-colons effectively being ignored.
F# interactive executes the entered F# on seeing a double semi-colon.
(Updated to cover F# interactive - courtesy of mfeingold)
The history of the double semicolon can be traced back to the beginnings of ML when semicolons were used as a separator in lists instead of commas. In this ICFP 2010 - Tribute to Robin Milner video around 50:15 Mike Gordon mentions:
There was a talk on F# where someone asked "Why is there double semicolon on the end of F# commands?" The reason is the separator in lists in the original ML is semicolons, so if you wanted a list 1;2;3; and put it on separate lines- if you ended a line with semicolon you were not ending the phrase, so using double semicolon meant the end of the expression. Then in Standard ML the separator for lists became comma, so that meant you could use single semicolons to end lists.

Point-free style with objects/records in F#

I'm getting stymied by the way "dot notation" works with objects and records when trying to program in a point-free functional style (which I think is a great, concise way to use a functional language that curries by default).
Is there an operator or function I'm missing that lets me do something like:
(.) object method instead of object.method?
(From what I was reading about the new ? operator, I think it works like this. Except it requires definition and gets into the whole dynamic binding thing, which I don't think I need.)
In other words, can I apply a method to its object as an argument like I would apply a normal function to its argument?
Short answer: no.
Longer answer: you can of course create let-bound functions in a module that call a method on a given type... For example in the code
let l = [1;2;3]
let h1 = l.Head
let h2 = List.head l
there is a sense in which "List.head" is the version of what you want for ".Head on a list". Or locally, you can always do e.g.
let AnotherWay = (fun (l:list<_>) -> l.Head)
let h3 = AnotherWay l
But there is nothing general, since there is no good way to 'name' an arbitrary instance method on a given type; 'AnotherWay' shows a way to "make a function out of the 'Head' property on a 'list<_>' object", but you need such boilerplate for every instance method you want to treat as a first-class function value.
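As a rough sketch of what that boilerplate buys you (head, lists, firsts and headPlusOne are names invented for this example): once the property is wrapped in a let-bound function, it pipes and composes like any other function.

```fsharp
// Hypothetical wrapper: turns the instance property .Head into an ordinary function.
let head (l: list<'a>) = l.Head

let lists = [ [1; 2]; [3; 4]; [5] ]

// The wrapper can be passed around as a first-class value...
let firsts = lists |> List.map head

// ...and composed point-free with other functions.
let headPlusOne = head >> (+) 1
```

With these definitions, firsts is [1; 3; 5] and headPlusOne [10; 20] is 11. The cost is exactly the one described above: one wrapper per member you want to use this way.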
I have suggested creating a language construct to generalize this:
With regards to language design suggestions, what if
SomeType..Foo optArgs // note *two* dots
meant
fun (x : SomeType) -> x.Foo optArgs
?
In which case you could write
list<_>..Head
as a way to 'functionize' this instance property, but if we ever do anything in that arena in F#, it would be post-VS2010.
If I understand your question correctly, the answer is: no, you can't. Dot (.) is not an operator in F#; it is built into the language, so it can't be used as a function.
