Related
The title^ is kinda confusing but I will illustrate what I want to achieve:
I have:
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>
}]
I want to convert it to a list like this:
[
{<<"5b3f77502dfe0deeb8912b42">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
]},
{<<"5bad45b1e990057961313822">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
]}
]
List of tuples [{id, [<List>]}, {id2, [<List>]} ] where ids are the second item of the tuple of the original list
Example :
<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>
Erlang newbie here. I created a dict with the second members of the tuples as keys and lists of corresponding tuples as values, then used dict:fold to transform it into the expected output format.
-export([test/0, transform/1]).
transform([H|T]) ->
transform([H|T], dict:new()).
transform([], D) ->
lists:reverse(
dict:fold(fun (Key, Tuples, Acc) ->
lists:append(Acc,[{Key,Tuples}])
end,
[],
D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
transform(T, dict:append_list(S2, [Tuple], D)).
test() ->
Input=[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
],
Output=transform(Input),
case Output of
[
{<<"5b3f77502dfe0deeb8912b42">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
]},
{<<"5bad45b1e990057961313822">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
]}
] -> ok;
_Else -> error
end.
I think I see what you're after... Please correct me if I'm wrong.
There are a number of ways to do this, it really just depends on what sort of data structure you're interested in using to check the presence of like-keys. I'll show you two fundamentally different ways to do this and a third hybrid method that has become recently available:
Indexed data types (in this case a map)
List operations with matching
Hybrid matching over map keys
Since you're new I'll use the first case to demonstrate two ways of writing it: explicit recursion and using an actual list function from the lists module.
Indexy Data Types
The first way we'll do this is to use a hash table (aka "dict", "map", "hash", "K/V", etc.) and explicitly recurse through the elements, checking for the presence of the key encountered and adding it if it is missing, or appending to the list of values it points to if it does. We'll use an Erlang map for this. At the end of the function we'll convert the utility map back to a list:
explicit_convert(List) ->
Map = explicit_convert(List, maps:new()),
maps:to_list(Map).
explicit_convert([H | T], A) ->
K = element(2, H),
NewA =
case maps:is_key(K, A) of
true ->
V = maps:get(K, A),
maps:put(K, [H | V], A);
false ->
maps:put(K, [H], A)
end,
explicit_convert(T, NewA);
explicit_convert([], A) ->
A.
There is nothing wrong with explicit recursion (it is particularly good if you're new, because every part of it is left in the open to be examined), but this is a "left fold" and we already have a library function that abstracts a little bit of the plumbing out. So we really only need to write a function that checks for the presence of an element, and adds the key or appends the value:
fun_convert(List) ->
Map = lists:foldl(fun convert/2, maps:new(), List),
maps:to_list(Map).
convert(H, A) ->
K = element(2, H),
case maps:is_key(K, A) of
true ->
V = maps:get(K, A),
maps:put(K, [H | V], A);
false ->
maps:put(K, [H], A)
end.
Listy Conversion
The other major way we could have done this is with listy matching. To do that you need to first guarantee that your elements are sorted on the element you want to use as a key so that you can use it as a sort of "working element" and match on it. The code should be pretty easy to understand once you stare at it for a bit (maybe write out how it will step through your list by hand on paper once if you're totally perplexed):
listy_convert(List) ->
[T = {_, K, _} | Rest] = lists:keysort(2, List),
listy_convert(Rest, {K, [T]}, []).
listy_convert([T = {_, K, _} | Rest], {K, Ts}, Acc) ->
listy_convert(Rest, {K, [T | Ts]}, Acc);
listy_convert([T = {_, K, _} | Rest], Done, Acc) ->
listy_convert(Rest, {K, [T]}, [Done | Acc]);
listy_convert([], Done, Acc) ->
[Done | Acc].
Note that we split the list immediately after sorting it. The reason is that we have "prime the pump", so to speak, on the first call we make to listy_convert/3. This also means that this function will crash if you pass it an empty list. You can solve that by adding a clause to listy_convert/1 that matches on the empty list [].
A Final Bit of Magic
With those firmly in mind... consider that we also have a bit of a hybrid option available in newer versions of Erlang due to the magical syntax available to maps. We can match (most values) on map keys inside of a case clause (though we can't unify on a key value provided by other arguments within a function head):
map_convert(List) ->
maps:to_list(map_convert(List, #{})).
map_convert([T = {_, K, _} | Rest], Acc) ->
case Acc of
#{K := Ts} -> map_convert(Rest, Acc#{K := [T | Ts]});
_ -> map_convert(Rest, Acc#{K => [T]})
end;
map_convert([], Acc) ->
Acc.
Here is a one-liner that would produce your expected result:
[{K, [E || {_, K2, _} = E <- List, K =:= K2]} || {_, K, _} <- lists:ukeysort(2, List)].
What’s going on here? Let’s do it step by step…
This is your original list
List = […],
lists:ukeysort/2 leaves just one element per key in the list
OnePerKey = lists:ukeysort(2, List),
We then extract the keys with the first list comprehension
Keys = [K || {_, K, _} <- OnePerKey],
With the second list comprehension, we find the elements with the key…
fun Filter(K, List) ->
[E || {_, K2, _} = E <- List, K =:= K2]
end
Keep in mind that we can’t just pattern-match with K in the generator (i.e. [E || {_, K, _} = E <- List]) because generators in LCs introduce new scope for the variables.
Finally, putting all together…
[{K, Filter(K, List)} || K <- Keys]
It really depends on your dataset. For lager data sets using maps is a bit more efficient.
-module(test).
-export([test/3, v1/2, v2/2, v3/2, transform/1, do/2]).
test(N, Keys, Size) ->
List = [{<<"5b71d7e458c37fa04a7ce768">>,rand:uniform(Keys),<<"1538077790705827">>} || I <- lists:seq(1,Size)],
V1 = timer:tc(test, v1, [N, List]),
V2 = timer:tc(test, v2, [N, List]),
V3 = timer:tc(test, v3, [N, List]),
io:format("V1 took: ~p, V2 took: ~p V3 took: ~p ~n", [V1, V2, V3]).
v1(N, List) when N > 0 ->
[{K, [E || {_, K2, _} = E <- List, K =:= K2]} || {_, K, _} <- lists:ukeysort(2, List)],
v1(N-1, List);
v1(_,_) -> ok.
v2(N, List) when N > 0 ->
do(List,maps:new()),
v2(N-1, List);
v2(_,_) -> ok.
v3(N, List) when N > 0 ->
transform(List),
v3(N-1, List);
v3(_,_) -> ok.
do([], R) -> maps:to_list(R);
do([H={_,K,_}|T], R) ->
case maps:get(K,R,null) of
null -> NewR = maps:put(K, [H], R);
V -> NewR = maps:update(K, [H|V], R)
end,
do(T, NewR).
transform([H|T]) ->
transform([H|T], dict:new()).
transform([], D) ->
lists:reverse(
dict:fold(fun (Key, Tuples, Acc) ->
lists:append(Acc,[{Key,Tuples}])
end,
[],
D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
transform(T, dict:append_list(S2, [Tuple], D)).
Running both with 100 unique keys and 100,000 records I get:
> test:test(1,100,100000).
V1 took: {75566,ok}, V2 took: {32087,ok} V3 took: {887362,ok}
ok
I couldn't find a beginner friendly answer to what the difference between the "local" and "let" keywords in SML is. Could someone provide a simple example please and explain when one is used over the other?
(TL;DR)
Use case ... of ... when you only have one temporary binding.
Use let ... in ... end for very specific helper functions.
Never use local ... in ... end. Use opaque modules instead.
Adding some thoughts on use-cases to sepp2k's fine answer:
(Summary) local ... in ... end is a declaration and let ... in ... end is an expression, so that effectively limits where they can be used: Where declarations are allowed (e.g. at the top level or inside a module), and inside value declarations (val and fun), respectively.
But so what? It often seems that either can be used. The Rosetta Stone QuickSort code, for example, could be structured using either, since the helper functions are only used once:
(* First using local ... in ... end *)
local
fun par_helper([], x, l, r) = (l, r)
| par_helper(h::t, x, l, r) =
if h <= x
then par_helper(t, x, l # [h], r)
else par_helper(t, x, l, r # [h])
fun par(l, x) = par_helper(l, x, [], [])
in
fun quicksort [] = []
| quicksort (h::t) =
let
val (left, right) = par(t, h)
in
quicksort left # [h] # quicksort right
end
end
(* Second using let ... in ... end *)
fun quicksort [] = []
| quicksort (h::t) =
let
fun par_helper([], x, l, r) = (l, r)
| par_helper(h::t, x, l, r) =
if h <= x
then par_helper(t, x, l # [h], r)
else par_helper(t, x, l, r # [h])
fun par(l, x) = par_helper(l, x, [], [])
val (left, right) = par(t, h)
in
quicksort left # [h] # quicksort right
end
So let's focus on when it is particularly useful to use one or the other.
local ... in ... end is mainly used when you have one or more temporary declarations (e.g. helper functions) that you want to hide after they're used, but they should be shared between multiple non-local declarations. E.g.
(* Helper function shared across multiple functions *)
local
fun par_helper ... = ...
fun par(l, x) = par_helper(l, x, [], [])
in
fun quicksort [] = []
| quicksort (h::t) = ... par(t, h) ...
fun median ... = ... par(t, h) ...
end
If there weren't multiple, you could have used a let ... in ... end instead.
You can always avoid using local ... in ... end in favor of opaque modules (see below).
let ... in ... end is mainly used when you want to compute temporary results, or deconstruct values of product types (tuples, records), one or more times inside a function. E.g.
fun quicksort [] = []
| quicksort (x::xs) =
let
val (left, right) = List.partition (fn y => y < x) xs
in
quicksort left # [x] # quicksort right
end
Here are some of the benefits of let ... in ... end:
A binding is computed once per function call (even when used multiple times).
A binding can simultaneously be deconstructed (into left and right here).
The declaration's scope is limited. (Same argument as for local ... in ... end.)
Inner functions may use the arguments of the outer function, or the outer function itself.
Multiple bindings that depend on each other may neatly be lined up.
And so on... Really, let-expressions are quite nice.
When a helper function is used once, you might as well nest it inside a let ... in ... end.
Especially if other reasons for using one applies, too.
Some additional opinions
(case ... of ... is awesome, too.)
When you have only one let ... in ... end you can instead write e.g.
fun quicksort [] = []
| quicksort (x::xs) =
case List.partition (fn y => y < x) xs of
(left, right) => quicksort left # [x] # quicksort right
These are equivalent. You might like the style of one or the other. The case ... of ... has one advantage, though, being that it also work for sum types ('a option, 'a list, etc.), e.g.
(* Using case ... of ... *)
fun maxList [] = NONE
| maxList (x::xs) =
case maxList xs of
NONE => SOME x
| SOME y => SOME (Int.max (x, y))
(* Using let ... in ... end and a helper function *)
fun maxList [] = NONE
| maxList (x::xs) =
let
val y_opt = maxList xs
in
Option.map (fn y => Int.max (x, y)) y_opt
end
The one disadvantage of case ... of ...: The pattern block does not stop, so nesting them often requires parentheses. You can also combine the two in different ways, e.g.
fun move p1 (GameState old_p) gameMap =
let val p' = addp p1 old_p in
case getMapPos p' gameMap of
Grass => GameState p'
| _ => GameState old_p
end
This isn't so much about not using local ... in ... end, though.
Hiding declarations that won't be used elsewhere is sensible. E.g.
(* if they're overly specific *)
fun handvalue hand =
let
fun handvalue' [] = 0
| handvalue' (c::cs) = cardvalue c + handvalue' cs
val hv = handvalue' hand
in
if hv > 21 andalso hasAce hand
then handvalue (removeAce hand) + 1
else hv
end
(* to cover over multiple arguments, e.g. to achieve tail-recursion, *)
(* or because the inner function has dependencies anyways (here: x). *)
fun par(ys, x) =
let fun par_helper([], l, r) = (l, r)
| par_helper(h::t, l, r) =
if h <= x
then par_helper(t, l # [h], r)
else par_helper(t, l, r # [h])
in par_helper(ys, [], []) end
And so on. Basically,
If a declaration (e.g. function) will be re-used, don't hide it.
If not, the point of local ... in ... end over let ... in ... end is void.
(local ... in ... end is useless.)
You never want to use local ... in ... end. Since its job is to isolate one set of helper declarations to a subset of your main declarations, this forces you to group those main declarations according to what they depend on, rather than perhaps a more desired order.
A better alternative is simply to write a structure, give it a signature and make that signature opaque. That way, all internal declarations can be used freely throughout the module without being exported.
One example of this in j4cbo's SML on Stilts web-framework is the module StaticServer: It exports only val server : ..., even though the structure also holds the two declarations structure U = WebUtil and val content_type = ....
structure StaticServer :> sig
val server: { basepath: string,
expires: LargeInt.int option,
headers: Web.header list } -> Web.app
end = struct
structure U = WebUtil
val content_type = fn
"png" => "image/png"
| "gif" => "image/gif"
| "jpg" => "image/jpeg"
| "css" => "text/css"
| "js" => "text/javascript"
| "html" => "text/html"
| _ => "text/plain"
fun server { basepath, expires, headers } (req: Web.request) = ...
end
The short answer is: local is a declaration, let is an expression. Consequently, they are used in different syntactic contexts, and local requires declarations between in and end, while let requires an expression there. It's not much deeper than that.
As #SimonShine mentioned, local is often discouraged in favour of using modules.
I'm trying to make a sumif function in Erlang that would return a sum of all elements in a list if the predicate function evaluates to true. Here is what I have:
sumif(_, []) -> undefined;
sumif(Fun, [H|T]) -> case Fun(H) of
true -> H + sumif(Fun, T);
false -> sumif(Fun, T)
end.
I also implemented my own pos function which returns true if a number is greater than 0 and false otherwise:
pos(A) -> A > 0.
I tried using pos with sumif but I'm getting this error:
exception error: bad function pos
Why is this happening? Is it because of my sumif function or pos? I have tested pos on its own and it seems to work just fine.
Edit: It might be because how I'm calling the function. This is how I'm currently calling it: hi:sumif(pos,[-1,1,2,-3]). Where hi is my module name.
Is it because of my sumif function or pos?
It's because of sumif. You should return 0 when an empty list is passed, as it'll be called from the 2nd clause when T is []:
-module(a).
-compile(export_all).
sumif(_, []) -> 0;
sumif(Fun, [H|T]) -> case Fun(H) of
true -> H + sumif(Fun, T);
false -> sumif(Fun, T)
end.
pos(A) -> A > 0.
Test:
1> c(a).
{ok,a}
2> a:sumif(fun a:pos/1, [-4, -2, 0, 2, 4]).
6
List comprehensions make things far simpler:
sumif(F, L) ->
lists:sum([X || X <- L, F(X)]).
Dobert's answer is of cousrse right, problem is your sum for empty list.
If your concern is performance a little bit you should stick to tail recursive solution (in this case it matter because there is not lists:reverse/1 involved).
sumif(F, L) ->
sumif(F, L, 0).
sumif(F, [], Acc) when is_function(F, 1) -> Acc;
sumif(F, [H|T], Acc) ->
New = case F(H) of
true -> H+Acc;
false -> Acc
end,
sumif(F, T, New).
Ways how to make correct function for first parameter:
F1 = fun pos/1, % inside module where pos/1 defined
F2 = fun xyz:pos/1, % exported function from module xyz (hot code swap works)
N = 0,
F3 = fun(X) -> X > N end, % closure
% test it
true = lists:all(fun(F) -> is_function(F, 1) end, [F1, F2, F3]).
There has tow error in your code:
1. sumif(_, []) -> undefined; should return 0, not undefined.
2. when you pass pos(A) -> A > 0. to sumif/2,you should use fun pos/1, please read http://erlang.org/doc/programming_examples/funs.html#id59138
sumif(F, L) ->
lists:foldl(fun(X, Sum) when F(X) -> Sum+X; (_) -> Sum end, 0, L).
You can use lists:foldl.
The following is my solution to Project Euler 14, which works (in 18 s):
%Which starting number, under one million, produces the longest Collartz chain?
-module(soln14).
-export([solve/0]).
collatz(L) ->
[H|T] = L,
F = erlang:get({'collatz', H}),
case is_list(F) of
true ->
R = lists:append(F, T);
false ->
if H == 1 ->
R = L;
true ->
if H rem 2 == 0 ->
R = collatz([H div 2 | L]);
true ->
R = collatz([3*H+1 | L])
end
end,
erlang:put({'collatz', lists:last(L)}, R),
R
end.
dosolve(N, Max, MaxN, TheList) ->
if N == 1000000 -> MaxN;
true ->
L = collatz([N]),
M = length(L),
if M > Max -> dosolve(N+1, M, N, L);
true ->
dosolve(N+1, Max, MaxN, TheList)
end
end.
solve() ->
{Megass, Ss, Micros} = erlang:timestamp(),
S = dosolve(1, -1, 1, []),
{Megase, Se, Microe} = erlang:timestamp(),
{Megase-Megass, Se-Ss, Microe-Micros, S}.
However, the compiler complains:
8> c(soln14).
soln14.erl:20: Warning: variable 'R' is unused
{ok,soln14}
9> soln14:solve().
{0,18,-386776,837799}
Is this a compiler scoping error, or do I have a legit bug?
It's not a compiler error, just a warning that in the true case of "case is_list(F) of", the bindning of R to the result of lists:append() is pointless, since this value of R will not be used after that point, just returned immediately. I'll leave it to you to figure out if that's a bug or not. It may be that you are fooled by your indentation. The lines "erlang:put(...)," and "R" are both still within the "false" case of "case is_list(F) of", and should be deeper indented to reflect this.
The error message and the code are not "synchronized". with the version you give, the warning is on line 10: R = lists:append(F, T);.
What it means is that you bind the result of the lists:append/2 call to R and that you don't use it later in the true statement.
this is not the case in the false statement since you use R in the function erlang:put/2.
You could write the code this way:
%Which starting number, under one million, produces the longest Collartz chain?
-module(soln14).
-export([solve/0,dosolve/4]).
collatz(L) ->
[H|T] = L,
F = erlang:get({'collatz', H}),
case is_list(F) of
true ->
lists:append(F, T);
false ->
R = if H == 1 ->
L;
true ->
if H rem 2 == 0 ->
collatz([H div 2 | L]);
true ->
collatz([3*H+1 | L])
end
end,
erlang:put({'collatz', lists:last(L)}, R),
R
end.
dosolve(N, Max, MaxN, TheList) ->
if N == 1000000 -> MaxN;
true ->
L = collatz([N]),
M = length(L),
if M > Max -> dosolve(N+1, M, N, L);
true ->
dosolve(N+1, Max, MaxN, TheList)
end
end.
solve() ->
timer:tc(?MODULE,dosolve,[1, -1, 1, []]).
Warning the code uses a huge amount of memory, collatz is not tail recursive, and it seems that there is some garbage collecting witch is not done.
I have the following setup:
1> rd(rec, {name, value}).
rec
2> L = [#rec{name = a, value = 1}, #rec{name = b, value = 2}, #rec{name = c, value = 3}].
[#rec{name = a,value = 1},
#rec{name = b,value = 2},
#rec{name = c,value = 3}]
3> M = [#rec{name = a, value = 111}, #rec{name = c, value = 333}].
[#rec{name = a,value = 111},#rec{name = c,value = 333}]
The elements in list L are unique based on their name. I also don't know the previous values of the elements in list M. What I am trying to do is to update list L with the values in list M, while keeping the elements of L that are not present in M. I did the following:
update_values([], _M, Acc) ->
Acc;
update_attributes_from_fact([H|T], M, Acc) ->
case [X#rec.value || X <- M, X#rec.name =:= H#rec.name] of
[] ->
update_values(T, M, [H|Acc]);
[NewValue] ->
update_values(T, M, [H#rec{value = NewValue}|Acc])
end.
It does the job but I wonder if there is a simpler method that uses bifs.
Thanks a lot.
There's no existing function that does this for you, since you just want to update the value field rather than replacing the entire record in L (like lists:keyreplace() does). If both L and M can be long, I recommend that if you can, you change L from a list to a dict or gb_tree using #rec.name as key. Then you can loop over M, and for each element in M, look up the correct entry if there is one and write back the updated record. The loop can be written as a fold. Even if you convert the list L to a dict first and convert it back again after the loop, it will be more efficient than the L*M approach. But if M is always short and you don't want to keep L as a dict in the rest of the code, your current approach is good.
Pure list comprehensions solution:
[case [X||X=#rec{name=XN}<-M, XN=:=N] of [] -> Y; [#rec{value =V}|_] -> Y#rec{value=V} end || Y=#rec{name=N} <- L].
little bit more effective using lists:keyfind/3:
[case lists:keyfind(N,#rec.name,M) of false -> Y; #rec{value=V} -> Y#rec{value=V} end || Y=#rec{name=N} <- L].
even more effective for big M:
D = dict:from_list([{X#rec.name, X#rec.value} || X<-M]),
[case dict:find(N,D) of error -> Y; {ok,V} -> Y#rec{value=V} end || Y=#rec{name=N} <- L].
but for really big M this approach can be fastest:
merge_join(lists:keysort(#rec.name, L), lists:ukeysort(#rec.name, M)).
merge_join(L, []) -> L;
merge_join([], _) -> [];
merge_join([#rec{name=N}=Y|L], [#rec{name=N, value=V}|_]=M) -> [Y#rec{value=V}|merge_join(L,M)];
merge_join([#rec{name=NL}=Y|L], [#rec{name=NM}|_]=M) when NL<NM -> [Y|merge_join(L,M)];
merge_join(L, [_|M]) -> merge_join(L, M).
You could use lists:ukeymerge/3:
lists:ukeymerge(#rec.name, M, L).
Which:
returns the sorted list formed by merging TupleList1 and TupleList2.
The merge is performed on the Nth element of each tuple. Both
TupleList1 and TupleList2 must be key-sorted without duplicates prior
to evaluating this function. When two tuples compare equal, the tuple
from TupleList1 is picked and the one from TupleList2 deleted.
A record is a tuple and you can use #rec.name to return the position of the key in a transparent way. Note that I reverted the lists L and M, since the function keeps the value from the first list.