I am really struggling to understand tail recursion in Erlang.
I have the following eunit test:
db_write_many_test() ->
    Db = db:new(),
    Db1 = db:write(francesco, london, Db),
    Db2 = db:write(lelle, stockholm, Db1),
    ?assertEqual([{lelle, stockholm},{francesco, london}], Db2).
And here is my implementation:
-module(db).
-include_lib("eunit/include/eunit.hrl").
-export([new/0,write/3]).
new() ->
    [].
write(Key, Value, Database) ->
    Record = {Key, Value},
    [Record|append(Database)].
append([H|T]) ->
    [H|append(T)];
append([]) ->
    [].
Is my implementation tail recursive and if not, how can I make it so?
Thanks in advance
Your implementation is not tail recursive because append must hold onto the head of the list while computing the tail. For a function to be tail recursive, the recursive call must be the last thing it does: the return value must not depend on anything other than what the recursive call returns.
You could rewrite it like so:
append(Acc, []) ->  %% termination clause
    Acc;
append(Acc, [H|T]) ->
    Acc2 = Acc ++ dosomethingto(H),  %% maybe you meant this to be your write function?
    append(Acc2, T).  %% tail recursive
Notice that all the work is finished by the time the tail recursive call occurs. So the append function can forget everything in the function body and only needs to remember the values of the arguments it passes into the next call.
Also notice that I put the termination clause before the recursive clause. Erlang tries the clauses in order, and termination clauses are typically more specific, so a less specific recursive clause placed first can hide them and prevent the function from ever returning, which is most likely not your desired behaviour. (Here [] and [H|T] can never match the same list, so the order does not change the result, but it is a good habit.)
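For the db module in the question, a minimal sketch might look like the following. It assumes that write/3 only needs to put the new record at the front (which is what the eunit test expects), so no list traversal is needed at all. The module name db_sketch and the helper copy/1 are hypothetical and only illustrate the accumulator pattern described above.

```erlang
-module(db_sketch).
-export([new/0, write/3, copy/1]).

new() ->
    [].

%% Prepending is a single cons cell: no recursion is needed,
%% so the question of tail recursion does not arise here.
write(Key, Value, Database) ->
    [{Key, Value} | Database].

%% Hypothetical helper: a tail-recursive copy of a list.
%% The accumulator is built in reverse, then reversed once at the end.
copy(List) ->
    copy(List, []).

copy([], Acc) ->
    lists:reverse(Acc);
copy([H|T], Acc) ->
    copy(T, [H|Acc]).   %% nothing left to do after this call
```

Prepending with [H|Acc] and reversing once at the end avoids the repeated copying that Acc ++ ... performs on every step.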
According to this previous answer
You could implement List.map like this:
let rec map project = function
  | [] -> []
  | head :: tail ->
      project head :: map project tail ;;
but instead, it is implemented like this:
let rec map project = function
  | [] -> []
  | head :: tail ->
      let result = project head in
      result :: map project tail ;;
They say that it is done this way to make sure the projection function is called in the expected order in case it has side effects, e.g.
map print_int [1;2;3] ;;
should print 123, but the first implementation would print 321. However, when I test both of them myself in OCaml and F#, they produce exactly the same 123 result.
(Note that I am testing this in the OCaml and F# REPLs--Nick in the comments suggests this might be the cause of my inability to reproduce, but why?)
What am I misunderstanding? Can someone elaborate why they should produce different orders and how I can reproduce? This runs contrary to my previous understanding of OCaml code I've written in the past so this was surprising to me and I want to make sure not to repeat the mistake. When I read the two, I read it as exactly the same thing with an extraneous intermediary binding.
My only guess is that the order of expression evaluation using cons is right to left, but that seems very odd?
This is being done purely as research to better understand how OCaml executes code, I don't really need to create my own List.map for production code.
The point is that the order of function application in OCaml is unspecified, not that it will be in some specific undesired order.
When evaluating this expression:
project head :: map project tail
OCaml is allowed to evaluate project head first or it can evaluate map project tail first. Which one it chooses to do is unspecified. (In theory it would probably be admissible for the order to be different for different calls.) Since you want a specified order, you need to use the form with let.
The fact that the order is unspecified is documented in Section 6.7 of the OCaml manual. See the section Function application:
The order in which the expressions expr, argument1, …, argumentn are evaluated is not specified.
(The claim that the evaluation order is unspecified isn't something you can test. No number of cases of a particular order prove that that order is always going to be chosen.)
So when you have an implementation of map like this:
let rec map f = function
| [] -> []
| a::l -> f a :: map f l
none of the function applications (f a) within the map calls are guaranteed to be evaluated sequentially in the order you'd expect. So when you try this:
map print_int [1;2;3]
you may get the output
321- : unit list = [(); (); ()]
since those function applications weren't executed in a guaranteed order.
Now when you implement the map like this:
let rec map f = function
| [] -> []
| a::l -> let r = f a in r :: map f l
you're forcing the function applications to be executed in the order you're expecting, because let r = f a in ... evaluates f a before the rest of the expression.
So now when you try:
map print_int [1;2;3]
you will get
123- : unit list = [(); (); ()]
because you've explicitly made an effort to evaluate the function applications in order.
I'm trying to process the items of a sorted set in Erlang. I call ZRANGE KEY 0 -1 WITHSCORES with eredis; the problem is that it returns something like [<<"item1">>, <<"100">>, <<"item2">>, <<"200">>]. How can I run a function f on these items efficiently so that these calls occur: f(<<"item1">>, <<"100">>), f(<<"item2">>, <<"200">>)?
I solved it with something like this
f([X,Y|T]) -> [do_the_job(X,Y)|f(T)];
f([]) -> [].
then calling:
f(List).
Is there a more efficient way for doing so?
An optimized way is to use tail recursion. Pass your list to do/1; it calls do/2 with an empty list as the accumulator, applies f/2 to each pair of head items of the given list, and returns the reversed accumulator once the input is exhausted:
do(List) ->
    do(List, []).
do([X,Y | Tail], Acc) ->
    do(Tail, [f(X, Y) | Acc]);
do([], Acc) ->
    lists:reverse(Acc).
f(X, Y) ->
    {X, Y}.
A note from Erlang documentation about tail-recursive efficiency:
In most cases, a recursive function uses more words on the stack for each recursion than the number of words a tail-recursive would allocate on the heap. As more memory is used, the garbage collector is invoked more frequently, and it has more work traversing the stack.
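As a sketch of the same pattern applied to the reply shape from the question: the module name pairs_sketch and the function pairs/1 are hypothetical, and the conversion with binary_to_integer/1 assumes the scores are always integers (Redis scores can be floats, in which case this conversion would fail).

```erlang
-module(pairs_sketch).
-export([pairs/1]).

%% Turn a flat [Member, Score, Member, Score, ...] list
%% into [{Member, Score}, ...] with integer scores.
pairs(List) ->
    pairs(List, []).

pairs([Member, Score | Rest], Acc) ->
    %% Assumption: every score is an integer encoded as a binary.
    pairs(Rest, [{Member, binary_to_integer(Score)} | Acc]);
pairs([], Acc) ->
    lists:reverse(Acc).
```

A list with an odd number of elements matches neither clause and fails with a function_clause error, which is probably what you want for a malformed reply.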
I'm trying to check with a case whether a list is empty, rather than recursively catching the pattern when it is. Is this the right way to go in Erlang, or am I walking down the wrong path, and pattern matching is the best way to catch whether a list has been emptied?
calculate([Head|Tail], pi, x, y) ->
...calculations of Head being sent to a list...
case Tail == [] of
false ->
calculate(Tail, pi, x, y)
end.
Or should I just pattern match on calculate if the list is empty?
Error in your code
General practice is to use function clauses with pattern matching. They work just like case, and they are considered much more readable. They also expose an error in your implementation.
First of all, your code could be rewritten in this manner:
calculate([Head|Tail], pi, x, y) ->
    %% ... actual calculations ...
    calculate(Tail, pi, x, y);
calculate([], pi, x, y) ->
    %% you need to return something here, but you don't
As you can see, one of the clauses does not return anything, which is not allowed in Erlang (compilation fails). Your implementation does exactly the same thing. case, like everything in Erlang, must return some value (and since it is the last expression in your function, this value is returned from the function). And since case needs to return something, it needs to match one of its clauses. In most calls, Tail == [] returns false, so this is not a problem. But at the last recursive call, when Tail is the empty list, Tail == [] returns true and case does not match anything. In Erlang this raises a case_clause error. So your implementation will always fail.
To fix it you need to make sure something always matches in your case, like this:
case Tail == [] of
    false ->
        calculate(Tail, pi, x, y);
    true ->
        ok  %% placeholder: return something
end.
Or it could be written like this
case Tail of
    [] ->
        ok;  %% placeholder: return something sane
    _ ->
        calculate(Tail, pi, x, y)
end.
where _ will match anything, and works somewhat like else in some other languages. And finally it could be written with function clauses, just like I showed before, but with this sane value returned.
EDIT
returning a value
If you look closer at our code right now, we are returning only one value: the one from the last recursive call (the one I called "sane"). If you would like to take into account the calculations from all recursive calls, you need to accumulate them somehow. To do this we will use an Acc variable:
calculate([Head|Tail], pi, x, y, Acc) ->
    %% ... actual calculations with result assigned to Res variable ...
    NewAcc = [Res | Acc],
    calculate(Tail, pi, x, y, NewAcc);
calculate([], pi, x, y, Acc) ->
    Acc.
In each recursive call we add our calculation Res to the accumulator Acc, and send this updated list to the next level of recursion. Finally, when our input list is empty (we have processed all the data), we just return the whole accumulator. All we need to do is make sure that when calculate is first called, it is called with an empty list as Acc. This can be done with a new (somewhat old) function:
calculate(List, pi, x, y) ->
    calculate(List, pi, x, y, _Acc = []).
Now we can export calculate/4 and keep calculate/5 private.
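A runnable sketch of the same accumulator idea, under some loud assumptions: the module name calc_sketch is hypothetical, the extra pi, x, y arguments are dropped for brevity, and squaring the head stands in for the real calculation. Because each result is prepended to the accumulator, the results come out in reverse order, so this version reverses the list once at the end.

```erlang
-module(calc_sketch).
-export([calculate/1]).

%% Public entry point: starts the recursion with an empty accumulator.
calculate(List) ->
    calculate(List, []).

calculate([Head|Tail], Acc) ->
    Res = Head * Head,              %% hypothetical stand-in calculation
    calculate(Tail, [Res | Acc]);   %% tail call: nothing left to do afterwards
calculate([], Acc) ->
    lists:reverse(Acc).             %% restore the input order
```

Only calculate/1 is exported here; calculate/2 stays private, as suggested above.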
Pattern match. It's the Right Thing.
It is also more efficient. It also prevents you from developing a habit of just accepting any sort of variables up front, going partway through your function and discovering that what you've received isn't even a list (oops!). Pattern matching (and using certain types of guards) is also central to the way Dialyzer checks success typings, which may or may not matter to you right now, but certainly will once you start working on the sort of software that has customers.
Most importantly, though, learning to take advantage of pattern matching teaches you to write smaller functions. Writing a huge function with a bajillion parameters that can do everything is certainly possible, and even common in many other languages, but pattern matching will illustrate to you why this is a bad idea as soon as you start writing your match cases. That will help you in ways I can't even begin to describe; it will seep into how you think about programs without you appreciating it at first; it will cut the clutter out of your nested conditions (because they won't exist); it will teach you to stop writing argument error checking code everywhere.
Add a clause for the empty list and, if that is not possible, one for a single-element list:
func([H],P,X,Y) ->
    do_something(H,P,X,Y);
func([H|T],P,X,Y) ->
    do_something(H,P,X,Y),
    func(T,P,X,Y).
Note that this will fail with an empty input list.
Also check whether you can use one of the functions lists:map/2 or lists:foldl/3, or a list comprehension.
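As a small sketch of those alternatives (the module name hof_sketch and the doubling and summing operations are hypothetical stand-ins for do_something), each of these walks the list without any hand-written recursion:

```erlang
-module(hof_sketch).
-export([doubled_map/1, doubled_lc/1, total/1]).

%% lists:map/2 applies the fun to every element and collects the results.
doubled_map(List) ->
    lists:map(fun(X) -> X * 2 end, List).

%% The same transformation as a list comprehension.
doubled_lc(List) ->
    [X * 2 || X <- List].

%% lists:foldl/3 threads an accumulator through the list, left to right.
total(List) ->
    lists:foldl(fun(X, Sum) -> X + Sum end, 0, List).
```

All three handle the empty list without a special clause.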
I'm completely new to Erlang. As an exercise to learn the language, I'm trying to implement the function sublist using tail recursion and without using reverse. Here's the function that I took from this site http://learnyousomeerlang.com/recursion:
tail_sublist(L, N) -> reverse(tail_sublist(L, N, [])).
tail_sublist(_, 0, SubList) -> SubList;
tail_sublist([], _, SubList) -> SubList;
tail_sublist([H|T], N, SubList) when N > 0 ->
    tail_sublist(T, N-1, [H|SubList]).
It seems the use of reverse in Erlang is very frequent.
In Mozart/Oz, it's very easy to create such a function using unbound variables:
proc {Sublist Xs N R}
   if N>0 then
      case Xs
      of nil then
         R = nil
      [] X|Xr then
         Unbound
      in
         R = X|Unbound
         {Sublist Xr N-1 Unbound}
      end
   else
      R=nil
   end
end
Is it possible to write similar code in Erlang? If not, why?
Edit:
I want to clarify something about the question. The function in Oz doesn't use any auxiliary function (no append, no reverse, nothing external, no BIF). It's also built using tail recursion.
When I ask if it's possible to create something similar in Erlang, I'm asking if it's possible to implement a function or set of functions in Erlang using tail recursion, iterating over the initial list only once.
At this point, after reading your comments and answers, I'm doubtful that it can be done, because Erlang doesn't seem to support unbound variables. It seems that all variables need to be assigned a value.
Short Version
No, you can't have similar code in Erlang. The reason is that in Erlang variables are single-assignment variables.
Unbound variables are simply not allowed in Erlang.
Long Version
I can't imagine a tail recursive function similar to the one you are presenting above, due to paradigm-level differences between the two languages you are trying to compare.
Nevertheless, it also depends on what you mean by similar code.
So, correct me if I am wrong, the following
R = X|Unbound
{Sublist Xr N-1 Unbound}
Means that the attribution (R=X|Unbound) will not be executed until the recursive call returns the value of Unbound.
This to me looks a lot like the following:
sublist(_,0) -> [];
sublist([],_) -> [];
sublist([H|T],N)
  when is_integer(N) ->
    NewTail = sublist(T,N-1),
    [H|NewTail].
%% or
%%sublist([H|T],N)
%%  when is_integer(N) -> [H|sublist(T,N-1)].
But this code isn't tail recursive.
Here's a version that uses appends along the way instead of a reverse at the end.
subl(L, N) -> subl(L, N, []).
subl(_, 0, Accumulator) ->
    Accumulator;
subl([], _, Accumulator) ->
    Accumulator;
subl([H|T], N, Accumulator) ->
    subl(T, N-1, Accumulator ++ [H]).
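One thing to keep in mind with the version above: Accumulator ++ [H] copies the whole accumulator on every step, so the total work grows quadratically with N. For comparison, here is a sketch of the two other shapes discussed in this thread, with N > 0 guards added as an assumption. The module name sublist_sketch and the function names body/2 and tail/2 are hypothetical.

```erlang
-module(sublist_sketch).
-export([body/2, tail/2]).

%% Body-recursive: the cons happens after the recursive call returns,
%% so the list is built in order and no reverse is needed.
body(_, 0) -> [];
body([], _) -> [];
body([H|T], N) when is_integer(N), N > 0 ->
    [H | body(T, N-1)].

%% Tail-recursive: prepend to the accumulator, reverse once at the end.
tail(L, N) -> tail(L, N, []).

tail(_, 0, Acc) -> lists:reverse(Acc);
tail([], _, Acc) -> lists:reverse(Acc);
tail([H|T], N, Acc) when is_integer(N), N > 0 ->
    tail(T, N-1, [H|Acc]).
```

In real code the standard library's lists:sublist/2 already does this.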
I would not say that "the use of reverse in Erlang is very frequent". I would say that the use of reverse is very common in toy problems in functional languages where lists are a significant data type.
I'm not sure how close to your Oz code you're trying to get with your "is it possible to create a similar code in Erlang? If not, why?" They are two different languages and have made many different syntax choices.
I wrote the following function:
let str2lst str =
    let rec f s acc =
        match s with
        | "" -> acc
        | _ -> f (s.Substring 1) (s.[0]::acc)
    f str []
How can I know if the F# compiler turned it into a loop? Is there a way to find out without using Reflector (I have no experience with Reflector and I don't know C#)?
Edit: Also, is it possible to write a tail recursive function without using an inner function, or is it necessary for the loop to reside in one?
Also, is there a function in the F# standard library to run a given function a number of times, each time giving it the last output as input? Let's say I have a string: I want to run a function over the string, then run it again over the resulting string, and so on...
Unfortunately there is no trivial way.
It is not too hard to read the source code and use the types and determine whether something is a tail call by inspection (is it 'the last thing', and not in a 'try' block), but people second-guess themselves and make mistakes. There's no simple automated way (other than e.g. inspecting the generated code).
Of course, you can just try your function on a large piece of test data and see if it blows up or not.
The F# compiler will generate .tail IL instructions for all tail calls (unless the compiler flag to turn them off is used, e.g. when you want to keep stack frames for debugging), with the exception that directly tail-recursive functions will be optimized into loops. (EDIT: I think nowadays the F# compiler also fails to emit .tail in cases where it can prove there are no recursive loops through this call site; this is an optimization given that the .tail opcode is a little slower on many platforms.)
'tailcall' is a reserved keyword, with the idea that a future version of F# may allow you to write e.g.
tailcall func args
and then get a warning/error if it's not a tail call.
Only functions that are not naturally tail-recursive (and thus need an extra accumulator parameter) will 'force' you into the 'inner function' idiom.
Here's a code sample of what you asked:
let rec nTimes n f x =
    if n = 0 then
        x
    else
        nTimes (n-1) f (f x)
let r = nTimes 3 (fun s -> s ^ " is a rose") "A rose"
printfn "%s" r
I like the rule of thumb Paul Graham formulates in On Lisp: if there is work left to do, e.g. manipulating the recursive call output, then the call is not tail recursive.
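To illustrate that rule of thumb in Erlang, the language most of this thread uses, here is a minimal sketch with a hypothetical module tail_sketch. In sum_body/1 the addition still has to happen after the recursive call returns, so the call is not a tail call; in sum_tail/2 the addition is done before the call and nothing is left to do afterwards.

```erlang
-module(tail_sketch).
-export([sum_body/1, sum_tail/1]).

%% Not tail recursive: "H + ..." is work left to do after the call.
sum_body([]) -> 0;
sum_body([H|T]) -> H + sum_body(T).

%% Tail recursive: the running total travels in the accumulator.
sum_tail(List) -> sum_tail(List, 0).

sum_tail([], Acc) -> Acc;
sum_tail([H|T], Acc) -> sum_tail(T, H + Acc).
```

Both return the same result; they differ only in whether each call needs to keep its own stack frame while waiting for the next one.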