Compiler Friendly Tail Recursion + Tail Recursion Checking in ATS

What is the best way to check that a function has been tail call optimized in ATS? (So far I have been running "top" to see if memory usage is constant)
As a follow-up: say you have a complex tail-recursive function that the compiler has failed to TCO. Is there a way to rewrite it in a more compiler-friendly way, or in such a way as to force the compiler to attempt TCO?

There is a peculiar way to do it in ATS2. Say you have

fnx foo(...) = bar(...)
and bar(...) = ... bar ...

The fnx keyword tells the compiler to combine foo and bar into a single C function in which each recursive tail call becomes a local jump. If the body of bar contains a non-tail-recursive call to bar, the generated C code is invalid, and the C compiler will complain with an error message.
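As a minimal sketch (the names and bodies here are made up for illustration), the following pair compiles because every recursive call is a tail call; rewriting the recursive call as, say, n + bar(n - 1, acc) would cause the generated C to be rejected:

fnx
foo (n: int): int = bar (n, 0)
and
bar (n: int, acc: int): int =
  if n > 0 then bar (n - 1, acc + n) else acc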
Things become a lot more challenging when (linear) streams are involved. A seemingly non-tail-recursive function can run without risk of stack overflow because it essentially saves its stack on the heap (and then frees it): this is a place where ATS really shines!

Related

Why does Erlang if statement support only specific functions in its guard?

Why does the Erlang if statement support only specific functions in its guard?
For example:
ok(A) ->
    if
        whereis(abc) =:= undefined ->
            register(abc, A);
        true ->
            exit(already_registered)
    end.
In this case we get an "illegal guard" error.
What would be the best practice for using a function's return value as a condition?
Coming from other programming languages, Erlang's if seems strangely restrictive, and in fact, isn't used very much, with most people opting to use case instead. The distinction between the two is that while case can test any expression, if can only use valid Guard Expressions.
As explained in the above link, Guard Expressions are limited to known functions that are guaranteed to be free of side-effects. There are a number of reasons for this, most of which boil down to code predictability and inspectability. For instance, since matching is done top-down, guard expressions that don't match will be executed until one is found that does. If those expressions had side-effects, it could easily lead to unpredictable and confusing outcomes during debugging. While you can still accomplish that with case expressions, if you see an if you can know there are no side effects being introduced in the test without needing to check.
One last, but important thing, is that guards have to terminate. If they did not, the reduction of a function call could go on forever, and as the scheduler is based around reductions, that would be very bad indeed, with little to go on when things went badly.
As a counter-example, you can starve the scheduler in Go for exactly this reason. Because Go's scheduler (like all micro-process schedulers) is co-operatively multitasked, it has to wait for a goroutine to yield before it can schedule another one. Much like in Erlang, it waits for a function to finish what it's currently doing before it can move on. The difference is that Erlang has no loop-alike. To accomplish looping, you recurse, which necessitates a function call/reduction, and allows a point for the scheduler to intervene. In Go, you have C-style loops, which do not require a function call in their body, so code akin to for { i = i+1 } will starve the scheduler. Not that such loops without function calls in their body are super-common, but this issue does exist.
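A minimal Go sketch of that situation (a hypothetical example; note that Go 1.14 and later added asynchronous preemption, which mitigates exactly this):

package main

// spin loops forever with no function call in its body, so a purely
// cooperative scheduler gets no yield point at which to preempt it.
// (Go 1.14+ added asynchronous preemption, which mitigates this.)
func spin() {
    i := 0
    for {
        i = i + 1
    }
}

func main() {
    go spin()
    select {} // block main forever
}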
In Erlang, by contrast, it's extremely difficult to do something like this without setting out to do so explicitly. But if guards were allowed to contain code that doesn't terminate, it would become trivial.
Check this question: About the usage of "if" in Erlang language
In short:
Only a limited number of functions are allowed in guard sequences, and whereis is not one of them
Use case instead.
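Applied to the code from the question, that looks like:

ok(A) ->
    case whereis(abc) of
        undefined -> register(abc, A);
        _Pid -> exit(already_registered)
    end.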

What reasoning led to `Sequence expression containing recursive definition is compiled incorrectly`

The question Stack overflow despite tail call position but only in 64-bit led to uncovering a bug in the F# compiler.
After reading the answer, I am curious about the reasoning that led to finding the bug, as I would like to improve my skills at resolving problems and understanding TCO.
My reasoning was something like this:
Talking about "tail calls" when looking at a computation expression can be misleading - in general there really isn't such a thing (see e.g. this other answer for one related discussion, though not related to sequence expressions).
So calling gauss will not itself ever overflow the stack; only the code iterating through it could make it overflow. But once I saw that the calling code was just something like Seq.nth, that meant there was almost certainly a compiler bug, because the pattern of "tail yield!ing" in a sequence expression is supposed to get optimized (not sure if this is part of the spec, but I think it's fairly well known). So then it was just a case of seeing which parts of the initial repro were necessary.
Replacing loop in the original code with a non-recursive definition made the repro stop failing, as did removing the pattern match. I didn't find looking at the IL helpful (there's so much compiler-generated machinery involved in the compilation of sequence expressions), I just tried minimizing the repro at the source level and empirically testing the behavior.
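For reference, the general shape under discussion is a sequence expression whose recursive call sits in a tail yield! position, something like this sketch (the body is illustrative, not the original repro):

let rec gauss acc n =
    seq {
        yield acc
        // a "tail yield!": the compiler is expected to turn this into
        // iteration rather than nested enumerators
        yield! gauss (acc + n) (n + 1)
    }

// iterating should then run in constant stack space
let value = Seq.nth 1000000 (gauss 0 1)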

Erlang: Will adding type spec to code make dialyzer more effective?

I have a project that doesn't have -spec or -type in the code. Currently Dialyzer can find some warnings, most of them in machine-generated code.
Will adding type specs to the code make Dialyzer find more errors?
Off topic: is there any tool to check whether the specs have been violated or not?
Adding typespecs will dramatically improve the accuracy of Dialyzer.
Because Erlang is a dynamic language Dialyzer must default to a rather broad interpretation of types unless you give it hints to narrow the "success" typing that it will go by. Think of it like giving Dialyzer a filter by which it can transform a set of possible successes to a subset of explicit types that should ever work.
This is not the same as Haskell, where the default assumption is failure and all code must be written with successful typing to be compiled at all -- Dialyzer must default to assume success unless it knows for sure that a type is going to fail.
Typespecs are the main part of this, but Dialyzer also checks guards, so a function like
increment(A) -> A + 1.
Is not the same as
increment(A) when A > 100 -> A + 1.
Though both may be typed as
-spec increment(integer()) -> integer().
Most of the time you only care about integer values being integer(), pos_integer(), neg_integer(), or non_neg_integer(), but occasionally you need an arbitrary range bounded only on one side -- and the type language has no way to represent this currently (though personally I would like to see a declaration of 100..infinity work as expected).
The unbounded range of when A > 100 requires a guard, but a bounded range like when A > 100 and A < 201 could be represented in the typespec alone:
-spec increment(101..200) -> pos_integer().
increment(A) -> A + 1.
Guards are fast, with the exception of calling length/1 (which you should probably never actually need in a guard), so don't worry about the performance overhead until you actually know and can demonstrate that you have a performance problem that comes from guards. Using guards and typespecs to constrain Dialyzer is extremely useful. They are also very useful as documentation for yourself, especially if you use edoc, where the typespec will be shown, making APIs less mysterious and easier to toy with at a glance.
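Putting spec and guard together, a minimal sketch (the module name is made up):

-module(bounds).
-export([increment/1]).

%% The spec narrows the success typing for Dialyzer; the guard enforces
%% the lower bound that the type language cannot currently express.
-spec increment(pos_integer()) -> pos_integer().
increment(A) when is_integer(A), A > 100 ->
    A + 1.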
There is some interesting literature on the use of Dialyzer in existing codebases. A well-documented experience is here: Gradual Typing of Erlang Programs: A Wrangler Experience. (Unfortunately, some of the other links I learned a lot from previously have disappeared or moved. A careful read of the Wrangler paper, skimming the User's Guide and man page, playing with Dialyzer, and some prior experience with a type system like Haskell's will more than prepare you for getting a lot of mileage out of Dialyzer, though.)
[On a side note, I've spoken with a few folks before about specifying "pure" functions that could be guaranteed as strongly typed either with a notation or by using a different definition syntax (maybe Prolog's :- instead of Erlang's ->... or something), but though that would be cool, and it is very possible even now to concentrate side-effects in a tiny part of the program and pass all results back in a tuple of {Results, SideEffectsTODO}, this is simply not a pressing need and Erlang works pretty darn well as-is. But Dialyzer is indeed very helpful for showing you where you've lost track of yourself!]

Check if function is declared recursive

Is it possible to check if a function was declared as recursive, i.e. with let rec?
I have a memoize function, but it doesn't work with arbitrary recursive functions. I would like to give an error if the user calls it with such a function. (Feel free to tell me if this is a bad way to deal with errors in F#)
F# code is compiled to .NET Common Intermediate Language. The F# rec keyword is just a hint to the F# compiler that makes the identifier being defined available in the scope of the function. So I believe there is no way to find out at runtime whether a function was declared as recursive.
However, you could use the System.Diagnostics.StackTrace class (https://msdn.microsoft.com/en-us/library/system.diagnostics.stacktrace.aspx) to get and analyze stack frames at runtime. But accessing stack information has a significant performance impact (I'm assuming that your memoization function is meant to speed up program performance). Also, stack information can differ between Debug and Release builds of your program, as the compiler can make optimizations.
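If you want to experiment with that approach anyway, here is a rough sketch of reading method names off the current stack (slow, and the frames you see differ between Debug and Release):

open System.Diagnostics

// Capture the current stack and list the method names on it.
// GetFrames/GetMethod can return null, hence the checks.
let currentStackMethods () =
    let frames = StackTrace().GetFrames()
    if isNull frames then [||]
    else
        frames
        |> Array.choose (fun frame ->
            match frame.GetMethod() with
            | null -> None
            | m -> Some m.Name)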
I don't think it can be done in a sensible and comprehensive manner.
You might want to reevaluate your problem though. I assume your function "doesn't work" with recursive functions, because it only memoizes the first call, and all the recursive calls go to the non-memoized function? Depending on your scenario, it may not be a bad thing.
Memoization trades off memory for speed. In a large system, you might not want to pay the cost of memoizing intermediate results if you only care about the final one. All those dictionaries add up over time.
If you however particularly care about memoizing a recursive function, you can explicitly structure it in a way that would lend itself to memoization. See linked threads.
So my answer would be that a memoize function shouldn't be concerned about recursion at all. If the user wants to memoize a recursive function, it falls to him to understand the trade-offs involved, and structure the function in a way that would allow intermediate calls to be memoized if that's what he requires.
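One standard way to structure a recursive function so that a wrapper can memoize its intermediate calls is "open recursion": the recursive call is taken as a parameter. A minimal F# sketch (memoizeRec is an illustrative name, not a library function):

open System.Collections.Generic

// "Open recursion": the recursive call goes through the 'self' parameter,
// so the memoizing wrapper intercepts every intermediate call, not just the first.
let memoizeRec f =
    let cache = Dictionary<_, _>()
    let rec g x =
        match cache.TryGetValue x with
        | true, v -> v
        | _ ->
            let v = f g x
            cache.[x] <- v
            v
    g

// usage: recursive calls go through 'self' and therefore hit the cache
let fib = memoizeRec (fun self n -> if n < 2 then n else self (n - 1) + self (n - 2))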

Pattern Matching, F# vs Erlang

In Erlang, you are encouraged not to match patterns that you do not actually handle. For example:
case (anint rem 10) of
    1 -> {ok, 10};
    9 -> {ok, 25}
end;
is a style that is encouraged, with other possible results raising a case_clause error. This is consistent with the "let it crash" philosophy in Erlang.
On the other hand, F# would issue an "incomplete pattern matching" in the equivalent F# code, like here.
The question: why wouldn't F# remove the warning, effectively by augmenting every pattern match with a case equivalent to
|_ -> failwith "badmatch"
and use the "let it crash" philosophy?
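For reference, the equivalent F# that triggers the warning looks something like:

let f anint =
    match anint % 10 with
    | 1 -> "ok", 10
    | 9 -> "ok", 25
// warning FS0025: Incomplete pattern matches on this expression.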
Edit: Two interesting answers so far: either to avoid bugs that are likely when not handling all cases of an algebraic datatype, or because of the .NET platform. One way to find out which is to check OCaml. So, what is the default behaviour in OCaml?
Edit: To clear up a misunderstanding among .NET people who have no background in Erlang: the point of the Erlang philosophy is not to produce bad code that always crashes. "Let it crash" means letting some other process fix the error. Instead of writing the function so that it can handle all possible cases, let the caller (for example) handle the bad cases, which are thrown automatically. For those with a Java background, it is like the difference between a language with checked exceptions, where everything a function can possibly return or throw must be declared, and a language in which functions may raise exceptions that are not explicitly declared.
F# (and other languages with pattern matching, like Haskell and OCaml) does implicitly add a case that throws an exception.
In my opinion the most valuable reason for having complete pattern matches and paying attention to the warning, is that it makes it easy to refactor by extending your datatype, because the compiler will then warn you about code you haven't yet updated with the new case.
On the other hand, sometimes there genuinely are cases that should be left out, and then it's annoying to have to put in a catch-all case with what is often a poor error message. So it's a trade-off.
In answer to your edit, this is also a warning by default in OCaml (and in Haskell with -Wall).
In most cases, particularly with algebraic datatypes, forgetting a case is likely to be an accident and not an intentional decision to ignore a case. In strongly typed functional languages, I think that most functions will be total, and should therefore handle every case. Even for partial functions, it's often ideal to throw a specific exception rather than to use a generic pattern matching failure (e.g. List.head throws an ArgumentException when given an empty list).
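For instance, a head-like function can raise a descriptive exception for the empty case instead of falling through to a generic match failure (a sketch, not the actual List.head implementation):

let head list =
    match list with
    | x :: _ -> x
    | [] -> invalidArg "list" "the input list was empty"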
Thus, I think that it generally makes sense for the compiler to warn the developer. If you don't like this behavior, you can either add a catch-all pattern which itself throws an exception, or turn off or ignore the warning.
why wouldn't F# remove the warning
Interesting that you would ask this. Silently injecting sources of run-time error is absolutely against the philosophy behind F# and its relatives. It is considered to be a grotesque abomination. This family of languages are all about static checking, to the extent that the type system was fundamentally designed to facilitate exactly these kinds of static checks.
This stark difference in philosophy is precisely why F# and Python are so rarely compared and contrasted. "Never the twain shall meet" as they say.
So, what is the default behaviour in OCaml?
Same as F#: exhaustiveness and redundancy of pattern matches are checked at compile time, and a warning is issued if a match is found to be suspect. Idiomatic style is also the same: you are expected to write your code such that these warnings do not appear.
This behaviour has nothing to do with .NET and, in fact, this functionality (from OCaml) was only implemented properly in F# quite recently.
For example, if you use a pattern in a let binding to extract the first element of a list because you know the list will always have at least one element:
let x::_ = myList
In this family of languages, that is almost always indicative of a design flaw. The correct solution is to represent your non-empty list using a type that makes it impossible to represent the empty list. Static type checking then proves that your list cannot be empty and, therefore, guarantees that this source of run-time errors has been completely eliminated from your code.
For example, you can represent a non-empty list as a tuple containing the head and the tail list. Your pattern match then becomes:
let x, _ = myList
This is exhaustive so the compiler is happy and does not issue a warning. This code cannot go wrong at run-time.
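The same idea works with a declared type that makes emptiness unrepresentable. An F# sketch (the type name here is made up):

// a non-empty list is a head plus an ordinary, possibly empty, tail
type NonEmptyList<'a> = NonEmpty of 'a * 'a list

// exhaustive single-case match: no warning, and no possible run-time failure
let first (NonEmpty (x, _)) = x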
I became an advocate of this technique back in 2004, when I refactored about 1kLOC of commercial OCaml code that had been a major source of run-time errors in an application (even though they were explicit in the form of catch-all match cases that raised exceptions). My refactoring removed all of the sources of run-time errors from most of the code. The reliability of the entire application improved enormously. Moreover, we had wasted weeks hunting bugs via debugging, but my refactoring was completed within 2 days. So this technique really does pay dividends in the real world.
Erlang cannot have exhaustive pattern-match checking because of dynamic typing, unless you put a catch-all in every match, which is just silly. OCaml, on the other hand, can. OCaml also tries to push every issue that can be caught at compile time to compile time.
OCaml by default does warn you about incomplete matches. You can disable the warning by adding "p" to the "-w" flag. The idea behind these warnings is that, more often than not (at least in my experience), they indicate a programmer error. Especially when your patterns are really complex, like Node (Node (Leaf 4) x) (Node y (Node Empty _)), it is easy to miss a case. When the programmer is sure that a case cannot occur, explicitly adding a | _ -> assert false case is an unambiguous way to indicate that.
GHC turns these warnings off by default, but you can enable them with -fwarn-incomplete-patterns.

Resources