I'm playing with F# and the compiler warns me if I don't use some result (the same problem described here). Since F# even has a function, ignore, for exactly this, it seems to be somewhat important, but I don't really understand why. Why doesn't C# care about it, but F# does?
One fundamental difference between C# and F# is that in F# everything is an expression (as opposed to a mix of expressions and statements). This includes things that in C-style languages are statements, like control flow constructs.
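For example (a minimal sketch), an if/else in F# is itself an expression that produces a value, rather than a statement:

let sign x =
    if x > 0 then "positive"   // both branches produce a value...
    else "non-positive"        // ...so the whole if/else is one expression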
When programming in a functional way, you want to have small pieces of referentially transparent code that you can compose together. The fact that everything is an expression plays right into that.
On the other hand, when you do something that gives you a value, and you just leave it there, you are going against that mindset. You are either doing it for some side-effect or you simply have a piece of left-over code somewhere. In either case it's fair game to warn you that you're doing something atypical.
F# discourages, but doesn't disallow, side effects, and lets you have (potentially side-effecting) expressions executed in a sequence, as long as the intermediate ones are of type unit. And this is what ignore does: it takes an argument and returns unit.
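For example, a minimal sketch (add and demo are made-up names) showing both the warning and how ignore silences it:

let add x y = x + y

let demo () =
    add 1 2              // warning FS0020: this int result is implicitly ignored
    add 1 2 |> ignore    // ignore : 'a -> unit, so the sequence type-checks silently
    printfn "done"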
In F#, almost everything is an expression with a value.
If you neglect the value of an expression in F#, either by failing to bind it or by failing to return it, it looks like you're making a mistake. Ignoring the value of an expression is an indication that you're depending on the side effect of an operation, and in F# you should be eschewing side effects.
function A: Boolean;
function B: Boolean;
I (accidentally) wrote this:
A or B;
Instead of that:
if not A then
  B;
The compiler rejects the first form; I am curious why.
With short-circuit evaluation they would both do the same thing, would they not?
Clarification: I was wondering why the language was not designed to allow my expression as a statement.
The first is an expression. Expressions are evaluated, and expressions have no visible side effects (such as reading or writing a variable). Both operands of the expression are functions, and those can have side effects, but for side effects to happen, a statement must be executed.
The second is a statement. It evaluates an expression and, based on the result, calls another function.
The confusing part is that, in this case, Delphi allows us to disregard the result of a function and call it as if it were a procedure. So you expect the same for A or B. But that is not allowed, which is good, because the behaviour would be ambiguous. For example, with short-circuit (lazy) evaluation enabled, if A evaluates to true, is B called or not?
Simply, because the compiler is expecting a statement and the expression that you have provided is not a statement.
Consult the documentation and you will find a list of valid statements. Your expression cannot be found in that list.
You asked in the (now deleted) comments why the language designers elected not to make such an expression count as a statement. But that question implies purpose where there may have been none. Languages are generally designed to solve specific problems. It's perfectly plausible that the designers never decided against treating such expressions as statements; rather, they simply never considered doing so in the first place.
The first form is an expression which evaluates to a Boolean value, not a statement.
At its heart, Delphi is Pascal. The Pascal language was designed by Niklaus Wirth and published in 1968. My copy of the User Manual and Report is from 1978. It was designed with two purposes in mind: as a teaching language, and as one that was easy to implement on any given machine. In this he was spectacularly successful.
Wirth was intimately familiar with other languages of the time (including Fortran, Cobol and particularly Algol) and made a series of careful choices with particular purposes in mind. In particular, he carefully separated the concept of 'actions' from 'values'. The 'actions' in Pascal are the statements in the language, including procedure call. The 'values' include function calls. In this and some other respects the language is quite similar to Algol.
The syntax for declaring and using actions and values are carefully kept quite separate. The language and the libraries provided with it do not in general have 'side effects' as such. Procedures do things and expressions calculate values. For example, 'read' is a procedure, not a function, because it retrieves a value and advances through the file, but 'eof' is a function.
The mass market version of Pascal was created by Borland in the mid 1980s and successively became Turbo Pascal for Windows and then Delphi. The language has changed a lot and not all of it is as pure as Wirth designed it. This is one feature that has survived.
Incidentally, Pascal did not have short-circuit evaluation. It had heap memory and sets, but no objects. They came later.
I'm learning F# (I'm new to functional programming in general, though I've used the functional aspects of C# for years; let's face it, that's pretty different), and one of the things I've read is that the F# compiler identifies tail recursion and compiles it into a while loop (see http://thevalerios.net/matt/2009/01/recursion-in-f-and-the-tail-recursion-police/).
What I don't understand is why you would write a recursive function instead of a while loop if that's what it's going to turn into anyway. Especially considering that you need to do some extra work to make your function tail recursive.
I have a feeling someone might say that the while loop is not particularly functional, and that if you want to act all functional and whatnot you should use recursion; but then why is it sufficient for the compiler to turn it into a while loop?
Can someone explain this to me?
You could use the same argument for any transformation that the compiler performs. For instance, when you're using C#, do you ever use lambda expressions or anonymous delegates? If the compiler is just going to turn those into classes and (non-anonymous) delegates, then why not just use those constructions yourself? Likewise, do you ever use iterator blocks? If the compiler is just going to turn those into state machines which explicitly implement IEnumerable<T>, then why not just write that code yourself? Or if the C# compiler is just going to emit IL anyway, why bother writing C# instead of IL in the first place? And so on.
One obvious answer to all of these questions is that we want to write code which allows us to express ourselves clearly. Likewise, there are many algorithms which are naturally recursive, and so writing recursive functions will often lead to a clear expression of those algorithms. In particular, it is arguably easier to reason about the termination of a recursive algorithm than a while loop in many cases (e.g. is there a clear base case, and does each recursive call make the problem "smaller"?).
However, since we're writing code and not mathematics papers, it's also nice to have software which meets certain real-world performance criteria (such as the ability to handle large inputs without overflowing the stack). Therefore, the fact that tail recursion is converted into the equivalent of while loops is critical for being able to use recursive formulations of algorithms.
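For example, here is a minimal sketch (the names are made up) of a tail-recursive sum: because the recursive call is the last thing the function does, the compiler can reuse the stack frame, and the function handles large inputs without overflowing:

let sum xs =
    let rec loop acc rest =
        match rest with
        | [] -> acc
        | x :: tail -> loop (acc + x) tail   // tail call: nothing left to do afterwards
    loop 0 xs

printfn "%d" (sum [1 .. 1000000])   // runs in constant stack space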
A recursive function is often the most natural way to work with certain data structures (such as trees and F# lists). If the compiler wants to transform my natural, intuitive code into an awkward while loop for performance reasons that's fine, but why would I want to write that myself?
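For example, a sketch of that natural shape on a binary tree (the type and function are made up for illustration); the recursion mirrors the structure of the data in a way a while loop would not:

type Tree =
    | Leaf of int
    | Node of Tree * Tree

let rec sumTree tree =
    match tree with
    | Leaf n -> n
    | Node (left, right) -> sumTree left + sumTree right   // note: not a tail call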
Also, Brian's answer to a related question is relevant here. Higher-order functions can often replace both loops and recursive functions in your code.
The fact that F# performs tail optimization is just an implementation detail that allows you to use tail recursion with the same efficiency (and no fear of a stack overflow) as a while loop. But it is just that - an implementation detail - on the surface your algorithm is still recursive and is structured that way, which for many algorithms is the most logical, functional way to represent it.
The same applies to some of the list handling internals as well in F# - internally mutation is used for a more efficient implementation of list manipulation, but this fact is hidden from the programmer.
What it comes down to is how the language allows you to describe and implement your algorithm, not what mechanics are used under the hood to make it happen.
A while loop is imperative by its nature. Most of the time, when using while loops, you will find yourself writing code like this:
let mutable x = ...
...
while someCond do
    ...
    x <- ...
This pattern is common in imperative languages like C, C++ or C#, but not so common in functional languages.
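For contrast, a hedged sketch of the functional counterpart (someCond and step are placeholders passed in as parameters): the mutable x becomes an argument threaded through a tail-recursive helper, which the compiler compiles back into a loop:

let run someCond step x0 =
    let rec loop x =
        if someCond x then loop (step x)   // tail call, compiled to a loop
        else x
    loop x0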
As the other posters have said, some data structures, more precisely recursive data structures, lend themselves to recursive processing. Since the most common data structure in functional languages is by far the singly linked list, solving problems using lists and recursive functions is a common practice.
Another argument in favor of recursive solutions is the tight relation between recursion and induction. Using a recursive solution allows the programmer to think about the problem inductively, which arguably helps in solving it.
Again, as other posters said, the fact that the compiler optimizes tail-recursive functions (obviously, not all functions can benefit from tail-call optimization) is an implementation detail which lets your recursive algorithm run in constant space.
In Erlang, you are encouraged not to match patterns that you do not actually handle. For example:
case (AnInt rem 10) of
    1 -> {ok, 10};
    9 -> {ok, 25}
end
is a style that is encouraged, with other possible results causing a badmatch error. This is consistent with the "let it crash" philosophy in Erlang.
On the other hand, F# would issue an "incomplete pattern matching" in the equivalent F# code, like here.
The question: why wouldn't F# remove the warning, effectively by augmenting every pattern matching with a statement equivalent to
| _ -> failwith "badmatch"
and use the "let it crash" philosophy?
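For reference, a sketch of the F# equivalent (classify is a made-up name): this match produces warning FS0025, "Incomplete pattern matches on this expression", and any unmatched value raises a MatchFailureException at run time:

let classify anInt =
    match anInt % 10 with
    | 1 -> (true, 10)
    | 9 -> (true, 25)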
Edit: Two interesting answers so far: either to avoid bugs that are likely when not handling all cases of an algebraic datatype; or because of the .Net platform. One way to find out which is to check OCaml. So, what is the default behaviour in OCaml?
Edit: To remove a misunderstanding by .NET people who have no background in Erlang: the point of the Erlang philosophy is not to produce bad code that always crashes. "Let it crash" means letting some other process fix the error. Instead of writing the function so that it can handle all possible cases, let the caller (for example) handle the bad cases, which are thrown automatically. For those with a Java background, it is like the difference between a language with checked exceptions, where every function must declare every exception it can possibly raise, and a language in which functions may raise exceptions that are not explicitly declared.
F# (and other languages with pattern matching, like Haskell and OCaml) does implicitly add a case that throws an exception.
In my opinion the most valuable reason for having complete pattern matches and paying attention to the warning, is that it makes it easy to refactor by extending your datatype, because the compiler will then warn you about code you haven't yet updated with the new case.
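For example, a sketch with a made-up type: if the Shape type below later gains a Triangle case, the compiler flags every match that has not yet been updated to handle it:

type Shape =
    | Circle of float
    | Square of float
    // adding "| Triangle of float * float" here turns the match below into
    // an "incomplete pattern matches" warning until a Triangle case is added

let area shape =
    match shape with
    | Circle r -> System.Math.PI * r * r
    | Square s -> s * s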
On the other hand, sometimes there genuinely are cases that should be left out, and then it's annoying to have to put in a catch-all case with what is often a poor error message. So it's a trade-off.
In answer to your edit, this is also a warning by default in OCaml (and in Haskell with -Wall).
In most cases, particularly with algebraic datatypes, forgetting a case is likely to be an accident and not an intentional decision to ignore a case. In strongly typed functional languages, I think that most functions will be total, and should therefore handle every case. Even for partial functions, it's often ideal to throw a specific exception rather than to use a generic pattern matching failure (e.g. List.head throws an ArgumentException when given an empty list).
Thus, I think that it generally makes sense for the compiler to warn the developer. If you don't like this behavior, you can either add a catch-all pattern which itself throws an exception, or turn off or ignore the warning.
why wouldn't F# remove the warning
Interesting that you would ask this. Silently injecting sources of run-time error is absolutely against the philosophy behind F# and its relatives. It is considered to be a grotesque abomination. This family of languages are all about static checking, to the extent that the type system was fundamentally designed to facilitate exactly these kinds of static checks.
This stark difference in philosophy is precisely why F# and Erlang are so rarely compared and contrasted. "Never the twain shall meet", as they say.
So, what is the default behaviour in OCaml?
Same as F#: exhaustiveness and redundancy of pattern matches is checked at compile time and a warning is issued if a match is found to be suspect. Idiomatic style is also the same: you are expected to write your code such that these warnings do not appear.
This behaviour has nothing to do with .NET and, in fact, this functionality (from OCaml) was only implemented properly in F# quite recently.
For example, if you use a pattern in a let binding to extract the first element of a list because you know the list will always have at least one element:
let x::_ = myList
In this family of languages, that is almost always indicative of a design flaw. The correct solution is to represent your non-empty list using a type that makes it impossible to represent the empty list. Static type checking then proves that your list cannot be empty and, therefore, guarantees that this source of run-time errors has been completely eliminated from your code.
For example, you can represent a non-empty list as a tuple containing the head and the tail list. Your pattern match then becomes:
let x, _ = myList
This is exhaustive so the compiler is happy and does not issue a warning. This code cannot go wrong at run-time.
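For example, a minimal sketch of the tuple representation described above (the names are illustrative):

// a non-empty list: the head paired with the (possibly empty) tail
type NonEmptyList<'a> = 'a * 'a list

let myList : NonEmptyList<int> = (1, [2; 3])

let x, _ = myList   // exhaustive: every value of this type has a head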
I became an advocate of this technique back in 2004, when I refactored about 1 kLOC of commercial OCaml code that had been a major source of run-time errors in an application (even though they were explicit, in the form of catch-all match cases that raised exceptions). My refactoring removed all of the sources of run-time errors from most of the code, and the reliability of the entire application improved enormously. Moreover, we had wasted weeks hunting those bugs with a debugger, whereas my refactoring was completed within two days. So this technique really does pay dividends in the real world.
Erlang cannot have exhaustive pattern-match checking because of its dynamic types, unless you put a catch-all in every match, which is just silly. OCaml, on the other hand, can. OCaml also tries to push all issues that can be caught at compile time to compile time.
OCaml by default does warn you about incomplete matches. You can disable the warning by adding "p" to the "-w" flag. The idea with these warnings is that more often than not (at least in my experience) they are an indication of programmer error. Especially when all your patterns are really complex, like Node (Node (Leaf 4) x) (Node y (Node Empty _)), it is easy to miss a case. When the programmer is sure that a match cannot go wrong, explicitly adding a | _ -> assert false case is an unambiguous way to indicate that.
GHC turns these warnings off by default, but you can enable them with -fwarn-incomplete-patterns.
It seems inconsistent within the lists module. For example, split has the number as the first argument and the list as the second, but sublist has the list as the first argument and the length as the second.
OK, a little history as I remember it and some principles behind my style.
As Christian has said, the libraries evolved and tended to get their argument order and feel from the impulses we were getting just then. So, for example, the reason element/setelement have the argument order they do is that it matches the arg/3 predicate in Prolog; logical then, but not now. Often we would have the thing being worked on first, but unfortunately not always. This is often a good choice, as it allows "optional" arguments to be conveniently added at the end; for example string:substr/2/3. Functions with the thing as the last argument were often influenced by functional languages with currying, for example Haskell, where it is very easy to use currying and partial application to build specific functions which can then be applied to the thing. This is very noticeable in the higher-order functions in lists.
The only influence we didn't have was from the OO world. :-)
Usually we at least managed to be consistent within a module, but not always. See lists again. We did try to have some consistency, so the argument order in the higher-order functions in dict/sets matches that of the corresponding functions in lists.
The problem was also aggravated by the fact that we, especially me, had a rather cavalier attitude to libraries. I just did not see them as a selling point for the language, so I wasn't that worried about it. "If you want a library which does something then you just write it" was my motto. This meant that my libraries were structured, just not always with the same structure. :-) That was how many of the initial libraries came about.
This, of course, creates unnecessary confusion and breaks the law of least astonishment, but we have not been able to do anything about it. Any suggestions of revising the modules have always been met with a resounding "no".
My own personal style is usually structured, though I don't know if it conforms to any written guidelines or standards.
I generally have the thing or things I am working on as the first arguments, or at least very close to the beginning; the exact order depends on what feels best. If there is a global state which is chained through the whole module, which there usually is, it is placed as the last argument and given a very descriptive name like St0, St1, ... (I belong to the church of short variable names). For arguments which are chained through functions (both input and output), I try to keep the argument order the same as the return order, which makes it much easier to see the structure of the code. Apart from that, I try to group together arguments which belong together. Also, where possible, I try to preserve the same argument order throughout a whole module.
None of this is very revolutionary, but I find if you keep a consistent style then it is one less thing to worry about and it makes your code feel better and generally be more readable. Also I will actually rewrite code if the argument order feels wrong.
A small example which may help:
fubar({f,A0,B0}, Arg2, Ch0, Arg4, St0) ->
    {A1,Ch1,St1} = foo(A0, Arg2, Ch0, St0),
    {B1,Ch2,St2} = bar(B0, Arg4, Ch1, St1),
    Res = baz(A1, B1),
    {Res,Ch2,St2}.
Here Ch is a local chained through variable while St is a more global state. Check out the code on github for LFE, especially the compiler, if you want a longer example.
This became much longer than it should have been, sorry.
P.S. I used the word thing instead of object to avoid confusion about what I was talking about.
No, there is no consistently-used idiom in the sense that you mean.
However, there are some useful relevant hints that apply especially when you're going to be making deeply recursive calls. For instance, keeping whichever arguments will remain unchanged during tail calls in the same order/position in the argument list allows the virtual machine to make some very nice optimizations.
Given a grammar and the attached action code, are there any standard solutions for deducing what type each production needs to result in (and consequently, what type the invoking production should expect to get from it)?
I'm thinking of an OO program and action code that employs something like C#'s var syntax (but I'm not looking for anything C#-specific).
This would be fairly simple if it were not for function overloading and recursive grammars.
The issue arises with cases like this:
Foo ::=
    Bar Baz { return Fig(Bar, Baz); }
  | Foo Pit { return Pop(Foo, Pit); } // typeof(Foo) = fn(typeof(Foo))
If you are writing code in a functional language it is easy; standard Hindley-Milner type inference works great. Do not do this. In my EBNF parser generator (never released, but source code available on request), which supports Icon, C, and Standard ML, I actually implemented the idea you are asking about for the Standard ML back end: all the types were inferred. The resulting grammars were nearly impossible to debug.
If you throw overloading into the mix, the results are only going to be harder to debug. (That's right! This just in! Harder than impossible! Greater than infinity! Past my bedtime!) If you really want to try it yourself you're welcome to my code. (You don't; there's a reason I never released it.)
The return value of a grammar action is really no different from a local variable, so you should be able to use C# type inference to do the job. See this paper for some insight into how C# type inference is implemented.
The standard way of doing type inference is the Hindley-Milner algorithm, but that will not handle overloading out-of-the-box.
Note that even parser generators for type-inferencing languages don't typically infer the types of grammar actions. For example, ocamlyacc requires type annotations. The Happy parser generator for Haskell can infer types, but seems to discourage the practice. This might indicate that inferring types in grammars is difficult, a bad idea, or both.
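For comparison, a hypothetical fsyacc-style fragment (fsyacc being F#'s analogue of ocamlyacc): the result type of each nonterminal is declared up front with %type rather than inferred from the action code:

%token <int> INT
%token PLUS
%left PLUS
%start expr
%type <int> expr
%%
expr:
    | INT            { $1 }
    | expr PLUS expr { $1 + $3 }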
[UPDATE] Very much pwned by Norman Ramsey, who has the benefit of bitter experience.