How to walk through directory in Erlang to take only folders? - erlang

-module(tut).
-export([main/0]).
main() ->
folders("C:/Users/David/test/").
folders(PATH) ->
{_,DD} = file:list_dir(PATH),
A = [{H,filelib:is_dir(PATH ++ H)}|| H <-DD],
% R is a list of all folders inside PATH
R = [PATH++X|| {X,Y} <- A, Y =:= true],
io:fwrite("~p~n", [R]),
case R of
[] -> ok;
% How call again folders function with the first element of the list?
% And save the result in some kind of structure
end.
Sorry for the beginner question, but I'm still new to Erlang. I would like to know how I can call the function again for each subfolder and save the results in some kind of list, tuple or other structure.
Like:
[
{"C:/Users/David/test/log",
{"C:/Users/David/test/log/a", "C:/Users/David/test/log/b"}},
{"C:/Users/David/test/logb",
{"C:/Users/David/test/logb/1", "C:/Users/David/test/logb/2","C:/Users/David/test/logb/3"}},
]

A few things:
These two calls can be simplified:
A = [{H,filelib:is_dir(PATH ++ H)}|| H <-DD],
R = [PATH++X|| {X,Y} <- A, Y =:= true],
into
A = [H || H <- DD, filelib:is_dir(PATH ++ H) =:= true],
In terms of representation, sub-folders should be held in a list, not a tuple; tuples would be difficult to work with here.
Sample structure: {Folder, [Subfolder1, Subfolder2, ...]}, where each SubfolderX has the same structure, recursively.
Folders form a tree, so a recursive call is needed here. I hope you are already familiar with the concept. Below is one way to do it using a list comprehension; there are other ways, e.g. using the lists:foldl function.
folders(PATH) ->
{_, DD} = file:list_dir(PATH),
A = [H || H <- DD, filelib:is_dir(PATH ++ "/" ++ H) =:= true],
%%io:format("Path: ~p, A: ~p~n", [PATH, A]),
case A of
[] -> %%Base case, i.e. folder has no sub-folders -> stop here
{PATH, []};
_ -> %%Recursive case, i.e. folder has sub-folders -> call #folders
{PATH, [folders(PATH ++ "/" ++ H2) || H2 <- A]}
end.
For consistency, call the function without a trailing forward slash, since the function adds the separator itself.
Folders = folders("C:/Users/David/test"). %% <- without forward slash
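As a variation on the recursive folders/1 above, the lists:foldl approach mentioned earlier might look roughly like the sketch below. The module name folders_fold is made up for illustration, and treating an unreadable directory as empty is an assumption of this sketch, not something the original code does. It uses filename:join/2, so a trailing slash on the argument no longer matters.

```erlang
-module(folders_fold).
-export([folders/1]).

%% Returns {Path, Subtrees}, where every subtree has the same shape.
folders(Path) ->
    Names = case file:list_dir(Path) of
                {ok, Entries} -> lists:sort(Entries);
                %% Assumption: an unreadable or missing directory counts as empty.
                {error, _Reason} -> []
            end,
    Subtrees = lists:foldl(
                 fun(Name, Acc) ->
                         Full = filename:join(Path, Name),
                         case filelib:is_dir(Full) of
                             true  -> [folders(Full) | Acc];  % recurse into sub-folder
                             false -> Acc                     % skip regular files
                         end
                 end, [], Names),
    %% foldl builds the list back to front, so restore sorted order.
    {Path, lists:reverse(Subtrees)}.
```

The fold only replaces the list comprehension; the recursion itself is unchanged.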
The helper function pretty_print below can be used to visualize the output in the Erlang shell.
Full code:
-module(tut).
-export([folders/1]).
-export([main/0]).
main() ->
Folders = folders("C:/Users/David/test"),
pretty_print(Folders, 0),
ok.
folders(PATH) ->
{_, DD} = file:list_dir(PATH),
A = [H || H <- DD, filelib:is_dir(PATH ++ "/" ++ H) =:= true], %%please note the "/" is added here
%%io:format("Path: ~p, A: ~p~n", [PATH, A]),
case A of
[] -> %%Base case, i.e. folder has no sub-folders -> stop here
{PATH, []};
_ -> %%Recursive case, i.e. folder has sub-folders -> call #folders
{PATH, [folders(PATH ++ "/" ++ H2) || H2 <- A]}
end.
pretty_print(Folders, Depth) ->
{CurrentFolder, ListSubfolders} = Folders,
SignTemp = lists:duplicate(Depth, "-"),
%% Bind Sign from the result of the case instead of inside each branch
Sign = case Depth of
0 -> SignTemp;
_ -> "|" ++ SignTemp
end,
io:format("~s~s~n", [Sign, CurrentFolder]),
lists:foreach(fun(Subfolder) -> pretty_print(Subfolder, Depth+1) end, ListSubfolders).

Related

List of tuples [{id, [<List>]}, {id2, [<List>]} ] where ids are the second item of the tuple of the original list- Erlang

The title^ is kinda confusing but I will illustrate what I want to achieve:
I have:
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>
}]
I want to convert it to a list like this:
[
{<<"5b3f77502dfe0deeb8912b42">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
]},
{<<"5bad45b1e990057961313822">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
]}
]
List of tuples [{id, [<List>]}, {id2, [<List>]} ] where ids are the second item of the tuple of the original list
Example :
<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>
Erlang newbie here. I created a dict with the second members of the tuples as keys and lists of corresponding tuples as values, then used dict:fold to transform it into the expected output format.
-export([test/0, transform/1]).
transform([H|T]) ->
transform([H|T], dict:new()).
transform([], D) ->
lists:reverse(
dict:fold(fun (Key, Tuples, Acc) ->
lists:append(Acc,[{Key,Tuples}])
end,
[],
D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
transform(T, dict:append_list(S2, [Tuple], D)).
test() ->
Input=[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
],
Output=transform(Input),
case Output of
[
{<<"5b3f77502dfe0deeb8912b42">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077790705827">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538078530667847">>},
{<<"5b71d7e458c37fa04a7ce768">>,<<"5b3f77502dfe0deeb8912b42">>,<<"1538077778390908">>}
]},
{<<"5bad45b1e990057961313822">>,
[{<<"5b71d7e458c37fa04a7ce768">>,<<"5bad45b1e990057961313822">>,<<"1538082492283531">>}
]}
] -> ok;
_Else -> error
end.
I think I see what you're after... Please correct me if I'm wrong.
There are a number of ways to do this; it really just depends on what sort of data structure you want to use to check for the presence of like keys. I'll show you two fundamentally different ways to do this and a third, hybrid method that has recently become available:
Indexed data types (in this case a map)
List operations with matching
Hybrid matching over map keys
Since you're new I'll use the first case to demonstrate two ways of writing it: explicit recursion and using an actual list function from the lists module.
Indexy Data Types
The first way we'll do this is to use a hash table (aka "dict", "map", "hash", "K/V", etc.) and explicitly recurse through the elements, checking for the presence of the key encountered and adding it if it is missing, or adding the element to the list of values it points to if it is present. We'll use an Erlang map for this. At the end of the function we'll convert the utility map back to a list:
explicit_convert(List) ->
Map = explicit_convert(List, maps:new()),
maps:to_list(Map).
explicit_convert([H | T], A) ->
K = element(2, H),
NewA =
case maps:is_key(K, A) of
true ->
V = maps:get(K, A),
maps:put(K, [H | V], A);
false ->
maps:put(K, [H], A)
end,
explicit_convert(T, NewA);
explicit_convert([], A) ->
A.
There is nothing wrong with explicit recursion (it is particularly good if you're new, because every part of it is left in the open to be examined), but this is a "left fold" and we already have a library function that abstracts a little bit of the plumbing out. So we really only need to write a function that checks for the presence of an element, and adds the key or appends the value:
fun_convert(List) ->
Map = lists:foldl(fun convert/2, maps:new(), List),
maps:to_list(Map).
convert(H, A) ->
K = element(2, H),
case maps:is_key(K, A) of
true ->
V = maps:get(K, A),
maps:put(K, [H | V], A);
false ->
maps:put(K, [H], A)
end.
Listy Conversion
The other major way we could have done this is with listy matching. To do that you need to first guarantee that your elements are sorted on the element you want to use as a key so that you can use it as a sort of "working element" and match on it. The code should be pretty easy to understand once you stare at it for a bit (maybe write out how it will step through your list by hand on paper once if you're totally perplexed):
listy_convert(List) ->
[T = {_, K, _} | Rest] = lists:keysort(2, List),
listy_convert(Rest, {K, [T]}, []).
listy_convert([T = {_, K, _} | Rest], {K, Ts}, Acc) ->
listy_convert(Rest, {K, [T | Ts]}, Acc);
listy_convert([T = {_, K, _} | Rest], Done, Acc) ->
listy_convert(Rest, {K, [T]}, [Done | Acc]);
listy_convert([], Done, Acc) ->
[Done | Acc].
Note that we split the list immediately after sorting it. The reason is that we have to "prime the pump", so to speak, on the first call we make to listy_convert/3. This also means that this function will crash if you pass it an empty list. You can solve that by adding a clause to listy_convert/1 that matches on the empty list [].
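A minimal sketch of that extra clause, wrapped in a module so it can be compiled on its own. The module name listy and the helper close/1 are made up here; the reversals are an addition of this sketch so that groups and their elements come out in sorted input order, which the version above does not do.

```erlang
-module(listy).
-export([listy_convert/1]).

%% The empty list needs its own clause, because there is no first
%% element to seed the accumulator with.
listy_convert([]) ->
    [];
listy_convert(List) ->
    [T = {_, K, _} | Rest] = lists:keysort(2, List),
    listy_convert(Rest, {K, [T]}, []).

%% Same key as the group being built: add to that group.
listy_convert([T = {_, K, _} | Rest], {K, Ts}, Acc) ->
    listy_convert(Rest, {K, [T | Ts]}, Acc);
%% Different key: close the current group and start a new one.
listy_convert([T = {_, K, _} | Rest], Done, Acc) ->
    listy_convert(Rest, {K, [T]}, [close(Done) | Acc]);
listy_convert([], Done, Acc) ->
    lists:reverse([close(Done) | Acc]).

%% Hypothetical helper: restores input order inside one group.
close({K, Ts}) ->
    {K, lists:reverse(Ts)}.
```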
A Final Bit of Magic
With those firmly in mind... consider that we also have a bit of a hybrid option available in newer versions of Erlang due to the magical syntax available to maps. We can match (most values) on map keys inside of a case clause (though we can't unify on a key value provided by other arguments within a function head):
map_convert(List) ->
maps:to_list(map_convert(List, #{})).
map_convert([T = {_, K, _} | Rest], Acc) ->
case Acc of
#{K := Ts} -> map_convert(Rest, Acc#{K := [T | Ts]});
_ -> map_convert(Rest, Acc#{K => [T]})
end;
map_convert([], Acc) ->
Acc.
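For comparison, the "insert or add to the existing list" step that all the map-based versions above spell out by hand can probably be delegated to maps:update_with/4, which is available in newer OTP releases (19 and later, as far as I know). The module name group_by and the function group_by_second/1 are made up for this sketch, and sorting the result by key is an extra choice so that the output order is predictable.

```erlang
-module(group_by).
-export([group_by_second/1]).

%% Groups 3-tuples by their second element.
%% Returns [{Key, Tuples}] sorted by Key, with Tuples in input order.
group_by_second(List) ->
    Map = lists:foldr(
            fun(T = {_, K, _}, Acc) ->
                    %% If K exists, prepend T to its list; otherwise start with [T].
                    maps:update_with(K, fun(Ts) -> [T | Ts] end, [T], Acc)
            end, #{}, List),
    lists:keysort(1, maps:to_list(Map)).
```

Folding from the right means each prepend lands in front of the later elements, so no reversal is needed afterwards.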
Here is a one-liner that would produce your expected result:
[{K, [E || {_, K2, _} = E <- List, K =:= K2]} || {_, K, _} <- lists:ukeysort(2, List)].
What’s going on here? Let’s do it step by step…
This is your original list
List = […],
lists:ukeysort/2 leaves just one element per key in the list
OnePerKey = lists:ukeysort(2, List),
We then extract the keys with the first list comprehension
Keys = [K || {_, K, _} <- OnePerKey],
With the second list comprehension, we find the elements with the key…
Filter = fun(K, L) ->
[E || {_, K2, _} = E <- L, K =:= K2]
end,
Keep in mind that we can’t just pattern-match with K in the generator (i.e. [E || {_, K, _} = E <- List]) because generators in LCs introduce new scope for the variables.
Finally, putting all together…
[{K, Filter(K, List)} || K <- Keys]
It really depends on your dataset. For larger data sets, using maps is a bit more efficient.
-module(test).
-export([test/3, v1/2, v2/2, v3/2, transform/1, do/2]).
test(N, Keys, Size) ->
List = [{<<"5b71d7e458c37fa04a7ce768">>,rand:uniform(Keys),<<"1538077790705827">>} || _ <- lists:seq(1,Size)],
V1 = timer:tc(test, v1, [N, List]),
V2 = timer:tc(test, v2, [N, List]),
V3 = timer:tc(test, v3, [N, List]),
io:format("V1 took: ~p, V2 took: ~p V3 took: ~p ~n", [V1, V2, V3]).
v1(N, List) when N > 0 ->
[{K, [E || {_, K2, _} = E <- List, K =:= K2]} || {_, K, _} <- lists:ukeysort(2, List)],
v1(N-1, List);
v1(_,_) -> ok.
v2(N, List) when N > 0 ->
do(List,maps:new()),
v2(N-1, List);
v2(_,_) -> ok.
v3(N, List) when N > 0 ->
transform(List),
v3(N-1, List);
v3(_,_) -> ok.
do([], R) -> maps:to_list(R);
do([H={_,K,_}|T], R) ->
case maps:get(K,R,null) of
null -> NewR = maps:put(K, [H], R);
V -> NewR = maps:update(K, [H|V], R)
end,
do(T, NewR).
transform([H|T]) ->
transform([H|T], dict:new()).
transform([], D) ->
lists:reverse(
dict:fold(fun (Key, Tuples, Acc) ->
lists:append(Acc,[{Key,Tuples}])
end,
[],
D));
transform([Tuple={_S1,S2,_S3}|T], D) ->
transform(T, dict:append_list(S2, [Tuple], D)).
Running all three with 100 unique keys and 100,000 records I get:
> test:test(1,100,100000).
V1 took: {75566,ok}, V2 took: {32087,ok} V3 took: {887362,ok}
ok

exception error: no function clause

I have added the code as it stands. It can be used on any piece of text. I am doing some work in Erlang and I am getting an error message, which I have included below.
exception error: no function clause matching string:to_lower({error,[80,75,3,4,20,0,6,0,8,0,0,0,33,0,2020], <<210,108,90,1,0,0,32,5,0,0,19,0,8,2,91,67,111,110,116,
101,110,116,95,84,121,...>>}) (string.erl, line 2084)
in function word_sort:readlines/1 (word_sort.erl, line 17).
I have also included an extract of my code below and I would appreciate if I could get pointers on where I am going wrong.
-module(word_sort).
-export([main/1]).
-export([unique/2]).
-export([sort/1]).
-export([readlines/1]).
-export([wordCount/3]).
% ========================================================== %
% Load the file and create a list %
% ========================================================== %
readlines(FileName) ->
io:format("~nLoading File : ~p~n", [FileName]),
{ok, File} = file:read_file(FileName),
Content = unicode:characters_to_list(File),
TokenList = string:tokens(string:to_lower(Content), " .,;:!?~/>'<{}£$%^&()#-=+_[]*#\\\n\r\"0123456789"),
main(TokenList).
% ========================================================== %
% Scan through the text file and find a list of unique words %
% ========================================================== %
main(TokenList) ->
UniqueList = unique(TokenList,[]),
io:format("~nSorted List : ~n"),
SortedList = sort(UniqueList), % Sorts UniqueList into SortedList%
io:format("~nSorted List : "),
io:format("~nWriting to file~n"),
{ok, F} = file:open("unique_words.txt", [write]),
register(my_output_file, F),
U = wordCounter(SortedList,TokenList,0),
io:format("~nUnique : ~p~n", [U]),
io:fwrite("~nComplete~n").
wordCounter([H|T],TokenList,N) ->
%io:fwrite("~p \t: ~p~n", [H,T]),
wordCount(H, TokenList, 0),
wordCounter(T,TokenList,N+1);
wordCounter([], _, N) -> N.
% =============================================================%
%Word count takes the unique word, and searches the original list for occurrences of that word%
%==============================================================%
wordCount(Word,[H|T],N) ->
case Word == H of % checks to see if H is in Seen List
true -> wordCount(Word, T, N+1); % if true, N_Seen = Seen List
false -> wordCount(Word, T, N) % if false, head appends Seen List.
end;
wordCount(Word,[],N) ->
io:fwrite("~p \t: ~p ~n", [N,Word]),
io:format(whereis(my_output_file), "~p \t: ~p ~n", [N,Word]).
%=================================================================================
unique([H|T],Seen) -> % Accepts List of numbers and Seen List
case lists:member(H, Seen) of % checks to see if H is in Seen List
true -> N_Seen = Seen; % if true, N_Seen = Seen List
false -> N_Seen = Seen ++ [H] % if false, head appends Seen List.
end,
unique(T,N_Seen); % calls uniques with Tail and Seen List.
%=================================================================================
unique([],Seen) -> Seen.
sort([Pivot|T]) ->
sort([ X || X <- T, X < Pivot]) ++
[Pivot] ++
sort([ X || X <- T, X >= Pivot]);
sort([]) -> [].
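unicode:characters_to_list/1 returned an error.
The variable Content therefore holds an error tuple instead of the decoded text, and string:to_lower/1 received that tuple as its argument instead of a string.
The decoded prefix in the error tuple, [80,75,3,4,...], is "PK" followed by bytes 3 and 4, which is the ZIP file signature, so the input is probably a zipped document rather than a plain-text file.
You just need to check what characters_to_list returns: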
readlines(FileName) ->
io:format("~nLoading File : ~p~n", [FileName]),
{ok, File} = file:read_file(FileName),
case unicode:characters_to_list(File) of
Content when is_list(Content) ->
LCcontent = string:to_lower(Content),
TokenList = string:tokens(LCcontent,
" .,;:!?~/>'<{}£$%^&()#-=+_[]*#\\\n\r\"0123456789"),
main(TokenList);
Err ->
io:format("Cannot read file, got some unicode error ~p~n", [Err])
end.
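To make the possible return shapes explicit, here is a small sketch that wraps unicode:characters_to_list/1 and tags every outcome. The module name decode_check and the reason atoms invalid_utf8 and truncated_utf8 are made up for illustration; only the three return shapes of characters_to_list itself come from the standard library.

```erlang
-module(decode_check).
-export([decode/1]).

%% Decodes a UTF-8 binary, always returning a tagged tuple.
decode(Bin) when is_binary(Bin) ->
    case unicode:characters_to_list(Bin) of
        Chars when is_list(Chars) ->
            {ok, Chars};
        {error, Decoded, Rest} ->
            %% Invalid byte sequence: report how many characters were
            %% decoded before the problem and what was left over.
            {error, {invalid_utf8, length(Decoded), Rest}};
        {incomplete, Decoded, Rest} ->
            %% Input ends in the middle of a multi-byte character.
            {error, {truncated_utf8, length(Decoded), Rest}}
    end.
```

With a wrapper like this, readlines/1 could match on {ok, Content} and report anything else, instead of passing an error tuple on to string:to_lower/1.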

Is it possible to define a recursive function within Erlang shell?

I am reading Programming Erlang, when I type these into erlang REPL:
perms([]) -> [[]];
perms(L) -> [[H|T] || H <- L, T <- perms(L--[H])].
* 1: syntax error before: '->'
I know I cannot define functions this way in shell, so I change it to:
2> Perms = fun([]) -> [[]];(L) -> [[H|T] || H <- L, T <- Perms(L--[H])] end.
* 1: variable 'Perms' is unbound
Does this mean I cannot define a recursive function within shell?
Since OTP 17.0 there are named funs:
Funs can now be given names
More details in README:
OTP-11537 Funs can now be a given a name. Thanks to to Richard O'Keefe
for the idea (EEP37) and to Anthony Ramine for the
implementation.
1> Perms = fun F([]) -> [[]];
F(L) -> [[H|T] || H <- L, T <- F(L--[H])]
end.
#Fun<erl_eval.30.54118792>
2> Perms([a,b,c]).
[[a,b,c],[a,c,b],[b,a,c],[b,c,a],[c,a,b],[c,b,a]]
In older releases, you have to be a little more clever and pass the fun to itself as an argument:
1> Perms = fun(List) ->
G = fun(_, []) -> [[]];
(F, L) -> [[H|T] || H <- L, T <- F(F, L--[H])]
end,
G(G, List)
end.
#Fun<erl_eval.30.54118792>
2> Perms([a,b,c]).
[[a,b,c],[a,c,b],[b,a,c],[b,c,a],[c,a,b],[c,b,a]]

Erlang repetition string in string

I have a string:
"abc abc abc abc"
How do I calculate the number of "abc" repetitions?
If you are looking for a practical and efficient implementation that scales well even for longer substrings, you can use binary:matches/2,3, which uses the Boyer–Moore string search algorithm (and Aho-Corasick for multiple substrings). Because of list_to_binary, the version below works only for ASCII or Latin-1 strings.
repeats(L, S) -> length(binary:matches(list_to_binary(L), list_to_binary(S))).
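If the strings may contain code points above 255, one possible adjustment is to encode both sides as UTF-8 with unicode:characters_to_binary/1 before searching. This is only a sketch: the module name repeats_utf8 is made up, it assumes both arguments are valid Unicode character lists, and binary:matches/2 raises badarg for an empty pattern. As in the version above, matches are counted without overlap.

```erlang
-module(repeats_utf8).
-export([repeats/2]).

%% Counts non-overlapping occurrences of Sub in String.
%% Both arguments are lists of Unicode code points.
repeats(String, Sub) ->
    Haystack = unicode:characters_to_binary(String),
    Needle = unicode:characters_to_binary(Sub),
    length(binary:matches(Haystack, Needle)).
```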
If it is for educational purposes, you can write your own, less efficient version that works on lists of any kind. If you know the substring at compile time, you can use this very simple version, whose performance is not too bad:
-define(SUBSTR, "abc").
repeats(L) -> repeats(L, 0).
repeats(?SUBSTR ++ L, N) -> repeats(L, N+1);
repeats([_|L] , N) -> repeats(L, N);
repeats([] , N) -> N.
If you don't know the substring in advance, you can write a slightly more complicated and less efficient version:
repeats(L, S) -> repeats(L, S, 0).
repeats([], _, N) -> N;
repeats(L, S, N) ->
case prefix(L, S) of
{found, L2} -> repeats( L2, S, N+1);
nope -> repeats(tl(L), S, N)
end.
prefix([H|T], [H|S]) -> prefix(T, S);
prefix( L, [ ]) -> {found, L};
prefix( _, _ ) -> nope.
And you can, of course, try to write a more sophisticated variant, such as a simplified Boyer–Moore for lists.
1> F = fun
F([],_,_,N) -> N;
F(L,P,S,N) ->
case string:sub_string(L,1,S) == P of
true -> F(tl(string:sub_string(L,S,length(L))),P,S,N+1);
_ -> F(tl(L),P,S,N)
end
end.
#Fun<erl_eval.28.106461118>
2> Find = fun(L,P) -> F(L,P,length(P),0) end.
#Fun<erl_eval.12.106461118>
3> Find("abc abc abc abc","abc").
4
4>
This works if defined in a module, or in the shell, but only from R17 on.
length(lists:filter(fun(X) -> X=="abc" end, string:tokens("abc abc abc abc", " "))).

Convert clause to a Fun

How can this clause be represented in one line using a fun?
perms([]) -> [[]];
perms(L) -> [[H|T] || H <- L, T <- perms(L--[H])].
I believe what you are seeking is for a fun to be "self-recursive".
The anonymous fun syntax cannot refer to itself inside the fun body (named funs arrived only in OTP 17), so one needs to use a trick where the fun to call is passed as a parameter. This is commonly referred to as the Y combinator.
Some example code will likely describe it better:
permutator() ->
fun
([], _F) ->
[[]];
(L, F) ->
[ [H|T] || H <- L, T <- F(L--[H], F)]
end.
do_permutate(L) ->
P = permutator(),
P(L, P).
As you can see, this is quite awkward. If you just want to refer to your existing perms function, you can use the code: fun perms/1.
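As a small illustration of the fun perms/1 form, the sketch below passes the module-level function to lists:map/2 as a value. The module name perms_ref and the function all_perms/1 are made up for this example; perms/1 is the function from the question.

```erlang
-module(perms_ref).
-export([perms/1, all_perms/1]).

%% All permutations of a list (from the question).
perms([]) -> [[]];
perms(L) -> [[H|T] || H <- L, T <- perms(L--[H])].

%% fun perms/1 turns the named function into a fun value, so no
%% self-passing trick is needed when the function lives in a module.
all_perms(Lists) ->
    lists:map(fun perms/1, Lists).
```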
I also got another answer similar to Christian's.
5> Perms = fun(X) -> Fun = fun([],F) -> [[]]; (L,F) -> [[H|T] || H <- L, T <- F(L--[H],F)] end, Fun(X, Fun) end.
#Fun<erl_eval.6.13229925>
6> Perms("cat").
["cat","cta","act","atc","tca","tac"]
