When pattern matching maps in Erlang, why is this variable unbound? - erlang

-module(count).
-export([count/1]).
count(L) when is_list(L) ->
do_count(L, #{});
count(_) ->
error(badarg).
do_count([], Acc) -> Acc;
do_count([H|T], #{}) -> do_count(T, #{ H => 1 });
do_count([H|T], Acc = #{ H := C }) -> do_count(T, Acc#{ H := C + 1});
do_count([H|T], Acc) -> do_count(T, Acc#{ H => 1 }).
In this example, the third clause of do_count/2, which handles the case where the map key H already exists and has a count associated with it, will not compile. The compiler complains:
count.erl:11: variable 'H' is unbound
Why is H unbound?
This works by the way:
do_count([], Acc) -> Acc;
do_count([H|T], Acc) -> do_count(T, maps:update_with(H, fun(C) -> C + 1 end, 1, Acc)).
But it seems like the pattern match ought to work and it doesn't.

The answer is pretty much the same as the one I recently gave here:
https://stackoverflow.com/a/46268109/240949.
When you use the same variable multiple times in a pattern, as with H in this case:
do_count([H|T], Acc = #{ H := C }) -> ...
the semantics of pattern matching in Erlang say that this is as if you had written
do_count([H|T], Acc = #{ H1 := C }) when H1 =:= H -> ...
that is, they are first bound separately, then compared for equality. But a key in a map pattern needs to be known - it can't be a variable like H1, hence the error (exactly as for field size specifiers in binary patterns, in the answer I linked to).
The main difference in this question is that you have a function head with two separate arguments, and you might think that the pattern [H|T] should be matched first, binding H before the second pattern is tried, but there is no such ordering guarantee; it's just as if you had used a single argument with a tuple pattern {[H|T], #{ H := C }}.
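If you want to keep the clause-per-case style, one option is to move the key test into a guard, since guards are evaluated only after the head has matched and H is bound. This is a minimal sketch, assuming OTP 21 or later for the is_map_key/2 and map_get/2 BIFs; the module name count_guard is made up for illustration.

```erlang
-module(count_guard).
-export([count/1]).

count(L) when is_list(L) ->
    do_count(L, #{}).

do_count([], Acc) ->
    Acc;
%% H is bound by the list pattern before the guard is evaluated,
%% so the guard can test whether it is already a key of Acc.
do_count([H|T], Acc) when is_map_key(H, Acc) ->
    do_count(T, Acc#{H := map_get(H, Acc) + 1});
do_count([H|T], Acc) ->
    do_count(T, Acc#{H => 1}).
```

The map pattern itself no longer mentions H, so the compiler has nothing to complain about.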

Because that kind of match occurs outside any context in which unification could have bound the key already. The docs don't explicitly forbid this, but they only explicitly state that matches with literals will work in function heads. I believe that there is an effort under way to make this construction work, but it is not there yet.
The issues surrounding unification versus assignment in different contexts within function heads are related to another question, about matching internal size values within binaries in function heads, that came up the other day.
(Remember, the function head is not just doing assignment, it is also trying to efficiently pick a path of execution. So this isn't actually a straightforward issue.)
All that said, a more Erlangish (and simpler) version of your count/1 function could be:
count(Items) ->
    count(Items, #{}).

count([], A) ->
    A;
count([H | T], A) ->
    NewA = maps:update_with(H, fun(V) -> V + 1 end, 1, A),
    count(T, NewA).
The case you are writing against was foreseen by the stdlib, and we have a nifty solution in the maps module called maps:update_with/4.
Note that we didn't give count/2 a new name. Unless the program requires otherwise, it is usually easier to give a helper function of a different arity the same name when doing explicit recursion. A function's identity is Name/Arity, so these are two totally separate functions whether or not the label is the same. Also, notice that we didn't check the argument type: the clauses of count/2 can only ever match a list, so any other argument fails with a function_clause error anyway.
Sometimes you will want polymorphic arguments in Erlang, and typechecking is appropriate. You almost never want defensive code in Erlang, though.
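To see what happens without the is_list/1 check, the sketch below calls count/1 with a non-list and catches the resulting error. The module name count_err and the demo/0 helper are made up for illustration.

```erlang
-module(count_err).
-export([count/1, demo/0]).

count(Items) ->
    count(Items, #{}).

count([], A) ->
    A;
count([H | T], A) ->
    count(T, maps:update_with(H, fun(V) -> V + 1 end, 1, A)).

%% No clause of count/2 matches an atom, so the call fails by itself;
%% we catch the error here only to show its reason.
demo() ->
    try count(not_a_list)
    catch error:Reason -> Reason
    end.
```

Here demo() returns function_clause: the process would have crashed with a clear reason even though nothing checked the type explicitly.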
Session with a module called foo:
1> c(foo).
{ok,foo}
2> foo:count([1,3,2,4,4,2,2,2,4,4,1,2]).
#{1 => 2,2 => 5,3 => 1,4 => 4}
BUT
We want to avoid explicit recursion unless there is a call for it, as we have all these nifty listy functional abstractions lying about in the stdlib. What you are really doing is condensing a list of values into a single aggregated value, and that is by definition a fold. So we could rewrite the above, perhaps more idiomatically, as:
count2(Items) ->
    Count = fun(I, A) -> maps:update_with(I, fun(V) -> V + 1 end, 1, A) end,
    lists:foldl(Count, #{}, Items).
And we get:
3> foo:count2([1,3,2,4,4,2,2,2,4,4,1,2]).
#{1 => 2,2 => 5,3 => 1,4 => 4}
Regarding case...
What I wrote about unification in a function head holds, but only for function heads, because they are a completely blank unification context. Richard's answer provides the best shorthand for remembering why this is crazy:
f(A, #{A := _})
is equivalent to
f(A, #{B := _}) when B =:= A
And that's just not going to fly. His comparison to tuple matching is spot on.
...but...
In a case expression where the variables involved have already been bound, this all works just fine because, as Richard helpfully mentioned in a comment, there is only one A in the case below.
1> M = #{1 => "one", 2 => "two"}.
#{1 => "one",2 => "two"}
2> F =
2> fun(A) ->
2> case M of
2> #{A := B} -> B;
2> _ -> "Oh noes! Not a key!"
2> end
2> end.
#Fun<erl_eval.6.87737649>
3> F(1).
"one"
4> F(2).
"two"
5> F(3).
"Oh noes! Not a key!"
So that may feel a bit idiosyncratic, but it makes sense based on the rules of matching/unification. It also means you can write your do_count/2 the way you did above by using a case inside the function body, but not as a set of function heads.
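Applied to the original question, that could look like the sketch below: H is bound in the function head, and the map is matched afterwards in a case. The module name count_case is made up for illustration.

```erlang
-module(count_case).
-export([count/1]).

count(L) when is_list(L) ->
    do_count(L, #{}).

do_count([], Acc) ->
    Acc;
do_count([H|T], Acc) ->
    %% H is already bound here, so it is allowed as a key in the map pattern.
    case Acc of
        #{H := C} -> do_count(T, Acc#{H := C + 1});
        _         -> do_count(T, Acc#{H => 1})
    end.
```

This keeps the shape of the original third and fourth clauses; only the place where the map is matched has moved.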

I made up this rule for myself: when using maps in the head of a function clause, the order of matching is not guaranteed. As a result, in your example you can't count on a [H|T] match to provide a value for H.
Several features of maps look like they should work, and Joe Armstrong says they should work, but they don't. It's a dumb part of Erlang. Witness my incredulity here: https://bugs.erlang.org/browse/ERL-88
Simpler examples:
do_stuff(X, [X|Y]) ->
io:format("~w~n", [Y]).
test() ->
do_stuff(a, [a,b,c]).
4> c(x).
{ok,x}
5> x:test().
[b,c]
ok
But:
-module(x).
-compile(export_all).
do_stuff(X, #{X := Y}) ->
io:format("~w~n", [Y]).
test() ->
do_stuff(a, #{a => 3}).
8> c(x).
x.erl:4: variable 'X' is unbound
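One way to get the second example to compile is to take the map as a plain argument and match it in the body, where X is already bound. This is only a sketch: the module name x_fixed is made up, and it returns Y instead of printing it.

```erlang
-module(x_fixed).
-export([do_stuff/2, test/0]).

do_stuff(X, Map) ->
    %% X was bound by the function head, so it can be used as a map key here.
    #{X := Y} = Map,
    Y.

test() ->
    do_stuff(a, #{a => 3}).
```

If the key is missing, the match in the body fails with a badmatch error rather than a function_clause error, which is the main behavioural difference from a match in the head.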

Related

Erlang: serial implementation of accumulator

I am trying to create a function that takes an associative and commutative operator, as well as a list of values, and then returns the answer by applying the operator to the values in the list.
The following two examples represent what the input/output are supposed to look like.
Example 1
Input: sum(fun(A,B) -> A+B end, [2,6,7,10,12]).
Output: 37
Example 2
Input: sum(fun (A,B) -> A++B end , ["C", "D", "E"]).
Output: "CDE"
This is the code I am working with so far.
-module(tester).
-compile(export_all).
sum(Func, Data, Acc) ->
lists:foldr(Func, Acc, Data).
This code produces the correct result, however, there are two problems I am trying to figure out how to approach answering.
(1) In order for this code to work, it requires an empty list to be included at the end of the command line statements. In other words, if I enter the input above (as in the examples), it will err out, because I did not write it in the following way:
12> tester:sum(fun(X, Acc) -> X+Acc end, [2,6,7,10,12], 0).
How would I implement this without an empty list as in the examples above and get the same result?
(2) Also, how would the code be implemented without the list function, or in an even more serial way?
How would I implement this without an empty list as in the examples above and get the same result?
Assuming the list always has at least one element (you can't really do it without this assumption), you can extract the first element from the list and pass that as the initial accumulator. You'll need to switch to foldl to do this efficiently. (With foldr you'd essentially need to make a copy of the list to drop the last element.)
sum(Func, [X | Xs]) ->
lists:foldl(fun (A, B) -> Func(B, A) end, X, Xs).
1> a:sum(fun(A,B) -> A+B end, [2,6,7,10,12]).
37
2> a:sum(fun (A,B) -> A++B end , ["C", "D", "E"]).
"CDE"
Also, how would the code be implemented without the list function, or in an even more serial way?
Here's a simple implementation using recursion and pattern matching:
sum2(Func, [X | Xs]) ->
    sum2(Func, Xs, X).

%% _Func is unused in the base case; the underscore avoids a compiler warning.
sum2(_Func, [], Acc) ->
    Acc;
sum2(Func, [X | Xs], Acc) ->
    sum2(Func, Xs, Func(Acc, X)).
We define two versions of the function. The first one extracts the head and uses that as the initial accumulator. The second one, with arity 3, does essentially what the fold functions in lists do.
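Both versions above call Func with the accumulator first and the next element second. lists:foldl/3 passes them the other way round (element first), which is why the foldl version wraps Func in a fun that swaps the arguments. The sketch below shows the difference with a non-commutative operator; the module and function names are made up for illustration.

```erlang
-module(fold_order).
-export([naive/2, swapped/2]).

%% Passes Func straight to foldl, so it is called as Func(Elem, Acc).
naive(Func, [X | Xs]) ->
    lists:foldl(Func, X, Xs).

%% Swaps the arguments so that Func is called as Func(Acc, Elem).
swapped(Func, [X | Xs]) ->
    lists:foldl(fun(Elem, Acc) -> Func(Acc, Elem) end, X, Xs).
```

With fun(A, B) -> A ++ B end and ["C", "D", "E"], naive/2 returns "EDC" while swapped/2 returns "CDE". For a commutative operator such as + the two give the same result.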
After working on this for a while, this was my solution. I've left some comments about the general idea of what I did, but there's a lot more to be said.
-module(erlang2).
-compile(export_all).
-export([reduce/2]).
reduce(Func, List) ->
reduce(root, Func, List).
%A single-element list at the top level has no parent process to notify
reduce(root, _, [A]) ->
A;
%When done send results to Parent
reduce(Parent, _, [A]) ->
%send to parent
Parent ! { self(), A};
%I tried this at first to take care of one el in list, but it didn't work
%length ([]) ->
% Parent ! {self(), A};
%get contents of list, apply function and store in Parent
reduce(Parent, Func, List) ->
{ Left, Right } = lists:split(trunc(length(List)/2), List),
Me = self(),
%io:format("Splitting in two~n"),
Pl = spawn(fun() -> reduce(Me, Func, Left) end),
Pr = spawn(fun() -> reduce(Me, Func, Right) end),
%merge results in parent and call Func on final left and right halves
combine(Parent, Func,[Pl, Pr]).
%merge Pl and Pr and combine in parent
combine(Parent, Func, [Pl, Pr]) ->
%wait for processes to complete (using receive) and then send to Parent
receive
{ Pl, Sorted } -> combine(Parent, Func, Pr, Sorted);
{ Pr, Sorted } -> combine(Parent, Func, Pl, Sorted)
end.
combine(Parent, Func, P, List) ->
%wait and store in results and then call ! to send
receive
{ P, Sorted } ->
Results = Func(Sorted, List),
case Parent of
root ->
Results;
%send results to parent
_ -> Parent ! {self(), Results}
end
end.

Our own tuple_to_list() function

I'm required to write my own tuple_to_list() function (yes, from the book) and came up with this in my erl file:
%% Our very own tuple_to_list function! %%
% First, the accumulator function
my_tuple_to_list_acc(T, L) -> [element(1, T) | L];
my_tuple_to_list_acc({}, L) -> L;
% Finally, the public face of the function
my_tuple_to_list(T) -> my_tuple_to_list_acc(T, []).
When I compile this, however, I get the following error in the shell:
28> c(lib_misc).
lib_misc.erl:34: head mismatch
lib_misc.erl:2: function my_tuple_to_list/1 undefined
error
I have no clue what "head mismatch" there is, and why is the function undefined (I've added it to the module export statement, though I doubt this has much to do with export statements)?
The other answer explains how to fix this, but not the reason. So: a ; after a function clause means that the next clause continues the same definition, just as for case and if branches. "head mismatch" means you have function clauses with different names and/or numbers of arguments in one definition. For the same reason, it is an error to have a clause ending with . followed by another clause with the same name and argument count.
Changing the order of the clauses is needed for a different reason, not because of the error. Clauses are always checked in order (again, same as for case and if) and your first clause already matches any two arguments, so the second would never be used.
Those errors mean that you didn't end the definition of my_tuple_to_list_acc/2.
You should swap the first two lines of code and end the second one with a dot.
my_tuple_to_list_acc({}, L) -> L;
my_tuple_to_list_acc(T, L) -> [element(1, T) | L].
If you are interested in a working tuple_to_list/1 implementation:
1> T2L = fun (T) -> (fun F(_, 0, Acc) -> Acc; F(T, N, Acc) -> F(T, N-1, [element(N, T)|Acc]) end)(T, tuple_size(T), []) end.
#Fun<erl_eval.6.50752066>
2> T2L({}).
[]
3> T2L({a,b,c}).
[a,b,c]
Or in a module:
my_tuple_to_list(_, 0, Acc) -> Acc;
my_tuple_to_list(T, N, Acc) ->
    my_tuple_to_list(T, N-1, [element(N, T)|Acc]).

my_tuple_to_list(T) ->
    my_tuple_to_list(T, tuple_size(T), []).
Note how I use a decreasing index so that the tail-recursive function builds the list in the right order without a reverse.

Counting down from N to 1

I'm trying to create a list and print it out, counting down from N to 1. This is my attempt:
%% Create a list counting down from N to 1 %%
-module(list).
-export([create_list/1]).
create_list(N) when length(N)<hd(N) ->
lists:append([N],lists:last([N])-1),
create_list(lists:last([N])-1);
create_list(N) ->
N.
This works when N is 1, but otherwise I get this error:
172> list:create_list([2]).
** exception error: an error occurred when evaluating an arithmetic expression
in function list:create_list/1 (list.erl, line 6)
Any help would be appreciated.
You should generally avoid using append or ++, which is the same thing, when building lists. They both add elements to the end of a list which entails making a copy of the list every time. Sometimes it is practical but it is always faster to work at the front of the list.
It is a bit unclear in which order you wanted the list so here are two alternatives:
create_up(N) when N>=1 -> create_up(1, N). %Create the list
create_up(N, N) -> [N];
create_up(I, N) ->
[I|create_up(I+1, N)].
create_down(N) when N>1 -> %Add guard test for safety
[N|create_down(N-1)];
create_down(1) -> [1].
Neither of these is tail-recursive. While tail-recursion is nice it doesn't always give as much as you would think, especially when you need to call a reverse to get the list in the right order. See Erlang myths for more information.
The error comes from lists:last([N])-1. Your input N is already a list, so [N] is a list containing that list, and lists:last([N]) returns N itself, not the number you expect; subtracting 1 from a list is what raises the arithmetic error. And if you look at the warning when compiling your code, there is another bug: lists:append does not modify N, it returns a new list, and that return value is discarded. In functional programming, the value of a variable cannot be changed.
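For the descending list specifically, a tail-recursive version does not need a reverse at all: counting up while prepending to an accumulator leaves the largest number at the front. A minimal sketch, with a made-up module name and an is_integer/1 guard added as an assumption about the intended input:

```erlang
-module(countdown).
-export([create_down/1]).

create_down(N) when is_integer(N), N >= 1 ->
    create_down(1, N, []).

%% I counts up from 1 to N; each I is prepended, so the result is [N, ..., 1].
create_down(I, N, Acc) when I > N ->
    Acc;
create_down(I, N, Acc) ->
    create_down(I + 1, N, [I | Acc]).
```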
Here's my implementation:
create_list(N) ->
create_list_iter(N, []).
create_list_iter(N, Acc) ->
case N > 0 of
true -> NewAcc = lists:append(Acc, [N]),
create_list_iter(N-1, NewAcc);
false -> Acc
end.
If I correctly understand your question, here is what you'll need
create_list(N) when N > 0 ->
create_list(N, []).
create_list(1, Acc) ->
lists:reverse([1 | Acc]);
create_list(N, Acc) ->
create_list(N - 1, [N | Acc]).
If you work with lists, I'd suggest using tail recursion and the list construction syntax.
Also, to simplify your code, try to use pattern matching in function heads instead of case expressions.
P.S.
Another, perhaps the simplest, solution is:
create_list(N) when N > 0 ->
lists:reverse(lists:seq(1,N)).
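lists:seq/3 also accepts a negative increment, so the reverse can be dropped. A sketch with a made-up module name:

```erlang
-module(seq_down).
-export([create_list/1]).

%% lists:seq(From, To, Incr) with Incr = -1 counts down from N to 1.
create_list(N) when is_integer(N), N > 0 ->
    lists:seq(N, 1, -1).
```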

Overuse of guards in Erlang?

I have the following function that takes a number like 5 and creates a list of all the numbers from 1 to that number so create(5). returns [1,2,3,4,5].
I have over used guards I think and was wondering if there is a better way to write the following:
create(N) ->
create(1, N).
create(N,M) when N =:= M ->
[N];
create(N,M) when N < M ->
[N] ++ create(N + 1, M).
The guard for N < M can be useful. In general, you don't need a guard for equality; you can use pattern-matching.
create(N) -> create(1, N).
create(M, M) -> [M];
create(N, M) when N < M -> [N | create(N + 1, M)].
You also generally want to write functions so they are tail-recursive, in which case the general idiom is to prepend each element to an accumulator and then reverse the accumulator at the end.
create(N) -> create(1, N, []).
create(M, M, Acc) -> lists:reverse([M | Acc]);
create(N, M, Acc) when N < M -> create(N + 1, M, [N | Acc]).
(Of course, with this specific example, you can alternatively build the results in the reverse order going down to 1 instead of up to M, which would make the lists:reverse call unnecessary.)
If create/2 (or create/3) is not exported and you put an appropriate guard on create/1, the extra N < M guard might be overkill. I generally only check on the exported functions and trust my own internal functions.
create(N,N) -> [N];
create(N,M) -> [N|create(N + 1, M)]. % Don't use ++ to prefix a single element.
This isn't quite the same (you could supply -5), but it behaves the same if you supply meaningful inputs. I wouldn't bother with the extra check anyway, since the process will crash very quickly either way.
BTW, you have a recursion depth problem with the code as-is. This will fix it:
create(N) ->
create(1, N, []).
create(N, N, Acc) -> [N|Acc];
create(N, M, Acc) -> create(N, M - 1, [M|Acc]).
I don't really think you have over used guards. There are two cases:
The first is the explicit equality test in the first clause of create/2
create(N, M) when N =:= M -> [M];
Some have suggested transforming this to use pattern matching like
create(N, N) -> [N];
In this case it makes no difference as the compiler internally transforms the pattern matching version to what you have written. You can safely pick which version you think feels best in each case.
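The two spellings can be compared side by side. Both use exact equality, so an integer and a float with the same value are treated as different. The module and function names below are made up for illustration.

```erlang
-module(eq_demo).
-export([same_pattern/2, same_guard/2]).

%% Equality expressed by repeating the variable in the head.
same_pattern(N, N) -> true;
same_pattern(_, _) -> false.

%% Equality expressed with an explicit =:= guard.
same_guard(N, M) when N =:= M -> true;
same_guard(_, _) -> false.
```

Both functions return true for (3, 3) and false for (1, 1.0).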
In the second case you need some form of sanity check that the value of the argument is in the range you expect. Doing it on every iteration is unnecessary, and I would move it to an equivalent test in create/1:
create(M) when M >= 1 -> create(1, M).
If you want to use an accumulator I would personally use the version that counts down, as it saves reversing the list at the end. If the list is not long I think the difference is very small and you can pick the version which feels clearest to you. Anyway, it is very easy to change later if you find it to be critical.
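Putting the pieces from these answers together, a version with a single check at the exported entry point and a count-down accumulator might look like this. It is a sketch: the module name is made up, and the guard assumes that create(1) should return [1].

```erlang
-module(create_list).
-export([create/1]).

%% The only sanity check lives on the exported function.
create(M) when is_integer(M), M >= 1 ->
    create(1, M, []).

%% Counts M down towards N, prepending as it goes, so no reverse is needed.
create(N, N, Acc) -> [N | Acc];
create(N, M, Acc) -> create(N, M - 1, [M | Acc]).
```

The internal create/3 trusts its caller, which is safe here because it is not exported.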
