I have the following functions:
search(DirName, Word) ->
NumberedFiles = list_numbered_files(DirName),
Words = make_filter_mapper(Word),
Index = mapreduce(NumberedFiles, Words, fun remove_duplicates/3),
dict:find(Word, Index).
list_numbered_files(DirName) ->
{ok, Files} = file:list_dir(DirName),
FullFiles = [ filename:join(DirName, File) || File <- Files ],
Indices = lists:seq(1, length(Files)),
lists:zip(Indices, FullFiles). % {Index, FileName} tuples
make_filter_mapper(MatchWord) ->
fun (_Index, FileName, Emit) ->
{ok, [Words]} = file:consult(FileName), %% <---- Line 20
lists:foreach(fun (Word) ->
case MatchWord == Word of
true -> Emit(Word, FileName);
false -> false
end
end, Words)
end.
remove_duplicates(Word, FileNames, Emit) ->
UniqueFiles = sets:to_list(sets:from_list(FileNames)),
lists:foreach(fun (FileName) -> Emit(Word, FileName) end, UniqueFiles).
However, when i call search(Path_to_Dir, Word) I get:
Error in process <0.185.0> with exit value:
{{badmatch,{error,{1,erl_parse,["syntax error before: ","wordinfile"]}}},
[{test,'-make_filter_mapper/1-fun-1-',4,[{file,"test.erl"},{line,20}]}]}
And I do not understand why. Any ideas?
The Words variable will match to content of the list, which might not be only one tuple, but many of them. Try to match {ok, Words} instead of {ok, [Words]}.
Beside the fact that the function file:consult/1 may return a list of several elements so you should replace {ok,[Words]} (expecting a list of one element = Words) by {ok,Words}, it actually returns a syntax error meaning that in the file you are reading, there is a syntax error.
Remember that the file should contain only valid erlang terms, each of them terminated by a dot. The most common error is to forget a dot or replace it by a comma.
Related
I'm running into a problem when writing some simple erlang code for an old Advent of Code task.
The following program is supposed to read lines, group characters in a string by occurrence and then count the number of lines that have a repeat of three characters.
count_occurrences([], Map) -> Map;
count_occurrences([H | T], Map) ->
count_occurrences(T, maps:put(H, maps:get(H, Map, 0) + 1, Map)).
count(Line, Count) ->
Map = count_occurrences(Line, #{}),
case lists:member(3, maps:values(Map)) of
true -> Count + 1;
false -> Count
end.
run() ->
{ok, Binary} = file:read_file("data.txt"),
Lines = binary:split(Binary, <<"\n">>, [global]),
Result = lists:foldl(fun count/2, 0, Lines),
Result.
However, I get this error message:
10> c(day2).
{ok,day2}
11> day2:run().
** exception error: no function clause matching day2:count_occurrences(<<"bpacnmelhhzpygfsjoxtvkwuor">>,#{}) (day2.erl, line 5)
in function day2:count/2 (day2.erl, line 10)
in call from lists:foldl/3 (lists.erl, line 1263)
I don't understand why <<"bpacnmelhhzpygfsjoxtvkwuor">>,#{} doesn't match the second "count_occurrences" function clause - a string is the same as a list, right? Why doesn't it match [H | T]?
Check out this example:
-module(a).
-compile(export_all).
go([_H|_T], _X) ->
"First arg was a list";
go("a", _X) ->
"First arg was a string";
go(<<"a">>, _X) ->
"First arg was a binary".
In the shell:
5> a:go(<<"a">>, #{a=>1, b=>2}).
"First arg was a binary"
and:
6> a:go("a", #{a=>1, b=>2}).
"First arg was a list"
a string is the same as a list, right?
Yes, a double quoted string is a shortcut for creating a list of integers where the integers in the list are the ascii codes of the characters. Hence, the second function clause above will never match:
a.erl:6: Warning: this clause cannot match because a previous clause
at line 4 always matches
But....a binary, such as <<"abc">> is NOT a string, and therefore a binary is not a shortcut for creating a list of integers.
8> "a" =:= [97].
true
Okay, you knew that. But, now:
9> "a" =:= <<"a">>.
false
10> <<"a">> =:= <<97>>.
true
11> "a" =:= <<97>>.
false
And, finally:
13> <<"abc">> =:= <<97, 98, 99>>.
true
The last example shows that specifying a double quoted string inside a binary is just a shortcut for specifying a comma separated list of integers inside a binary--however specifying a double quoted string inside a binary does not somehow convert the binary to a list.
Note that you can also iterate through a binary with only slightly different syntax:
count_occurrences(<<>>, Map) -> Map;
count_occurrences(<<H, T/binary>>, Map) ->
count_occurrences(T, maps:put(H, maps:get(H, Map, 0) + 1, Map)).
By default, H is assumed to be a byte, but you can add modifiers to specify how many bits you want to select, and more. See the documentation for the Bit Syntax.
You get this error cuz function count_occurrences/2 expect first argument list - [<<"bpacnmelhhzpygfsjoxtvkwuor">>] or "bpacnmelhhzpygfsjoxtvkwuor" but was put binary - <<"bpacnmelhhzpygfsjoxtvkwuor">>. Double check input data Line in function count/2 of module day2.erl at line 10:
1> is_list([]).
true
2> is_list("").
true
3> is_list(<<"">>).
false
4> is_list(binary_to_list(<<"">>)).
true
When executing an implementation of the Tarry distributed algorithm, a problem occurs that I don't know how to address: a crash containing the error {undef,[{rand,uniform,[2],[]}. My module is below:
-module(assign2_ex).
-compile(export_all).
%% Tarry's Algorithm with depth-first version
start() ->
Out = get_lines([]),
Nodes = createNodes(tl(Out)),
Initial = lists:keyfind(hd(Out), 1, Nodes),
InitialPid = element(2, Initial),
InitialPid ! {{"main", self()}, []},
receive
{_, List} ->
Names = lists:map(fun(X) -> element(1, X) end, List),
String = lists:join(" ", lists:reverse(Names)),
io:format("~s~n", [String])
end.
get_lines(Lines) ->
case io:get_line("") of
%% End of file, reverse the input for correct order
eof -> lists:reverse(Lines);
Line ->
%% Split each line on spaces and new lines
Nodes = string:tokens(Line, " \n"),
%% Check next line and add nodes to the result
get_lines([Nodes | Lines])
end.
%% Create Nodes
createNodes(List) ->
NodeNames = [[lists:nth(1, Node)] || Node <- List],
Neighbours = [tl(SubList) || SubList <- List],
Pids = [spawn(assign2_ex, midFunction, [Name]) || Name <-NodeNames],
NodeIDs = lists:zip(NodeNames, Pids),
NeighbourIDs = [getNeighbours(N, NodeIDs) || N <- lists:zip(NodeIDs, Neighbours)],
[Pid ! NeighbourPids || {{_, Pid}, NeighbourPids} <- NeighbourIDs],
NodeIDs.
getNeighbours({{Name, PID}, NeighboursForOne}, NodeIDs) ->
FuncMap = fun(Node) -> lists:keyfind([Node], 1, NodeIDs) end,
{{Name, PID}, lists:map(FuncMap, NeighboursForOne)}.
midFunction(Node) ->
receive
Neighbours -> tarry_depth(Node, Neighbours, [])
end.
%% Tarry's Algorithm with depth-first version
%% Doesn't visit the nodes which have been visited
tarry_depth(Name, Neighbours, OldParent) ->
receive
{Sender, Visited} ->
Parent = case OldParent of [] -> [Sender]; _ -> OldParent end,
Unvisited = lists:subtract(Neighbours, Visited),
Next = case Unvisited of
[] -> hd(Parent);
_ -> lists:nth(rand:uniform(length(Unvisited)), Unvisited)
end,
Self = {Name, self()},
element(2, Next) ! {Self, [Self | Visited]},
tarry_depth(Name, Neighbours, Parent)
end.
An undef error means that the program tried to call an undefined function. There are three reasons that this can happen for:
There is no module with that name (in this case rand), or it cannot be found and loaded for some reason
The module doesn't define a function with that name and arity. In this case, the function in question is uniform with one argument. (Note that in Erlang, functions with the same name but different numbers of arguments are considered separate functions.)
There is such a function, but it isn't exported.
You can check the first by typing l(rand). in an Erlang shell, and the second and third by running rand:module_info(exports)..
In this case, I suspect that the problem is that you're using an old version of Erlang/OTP. As noted in the documentation, the rand module was introduced in release 18.0.
Will be good if you provide the version of Erlang/OTP you are using for future questions as Erlang has changed a lot over the years. As far as i know there is no rand:uniform with arity 2 at least in recent Erlang versions and that is what you are getting the undef error, for that case you could use crypto:rand_uniform/2 like crypto:rand_uniform(Low, High). Hope this helps :)
I am having Data like the below:
Data = [{<<"status">>,<<"success">>},
{<<"META">>,
{struct,[{<<"createdat">>,1406895903.0},
{<<"user_email">>,<<"gopikrishnajonnada#gmail.com">>},
{<<"campaign">>,<<"5IVUPHE42HP1NEYvKb7qSvpX2Cm">>}]}},
{<<"mode">>,1}]
And Now i am having a
FieldList = ['<<"5IVUPHE42HP1NEYvKb7qSvpX2Cm">>']
Now:
I am trying like the below but i am getting empty instead of the value
90> [L || L <- FieldList,proplists:get_value(<<"campaign">>,element(2,proplists:get_value(<<"META">>,Data,{[],[]}))) == L].
[]
so how to get the both values are equal and get the final value.
You can parse the atom as if it were an Erlang term:
atom_to_binary(Atom) ->
L = atom_to_list(Atom),
{ok, Tokens, _} = erl_scan:string(L ++ "."),
{ok, Result} = erl_parse:parse_term(Tokens),
Result.
You can then do
[L ||
L <- FieldList,
proplists:get_value(<<"campaign">>,
element(2,
proplists:get_value(<<"META">>,Data,{[],[]})))
== atom_to_binary(L)
].
You can also do it the other way round, (trying to) convert the binary to an atom using this function:
binary_literal_to_atom(Binary) ->
Literal = lists:flatten(io_lib:format("~p", [Binary])),
try
list_to_existing_atom(Literal)
catch
error:badarg -> undefined
end.
This function will return undefined if the atom is not known yet (s. Erlang: binary_to_atom filling up atom table space security issue for more information on this). This is fine here, since the match can only work if the atom was known before, in this case by being defined in the FieldList variable.
How did you get those values in the first place?
Data = [{<<"status">>,<<"success">>},
{<<"META">>,
{struct,[{<<"createdat">>,1406895903.0},
{<<"user_email">>,<<"gopikrishnajonnada#gmail.com">>},
{<<"campaign">>,<<"5IVUPHE42HP1NEYvKb7qSvpX2Cm">>}]
}
},
{<<"mode">>,1}].
[_,{_,{struct,InData}}|_] = Data.
[X || {<<"campaign">>,X} <- InData].
it gives you the result in the form : [<<"5IVUPHE42HP1NEYvKb7qSvpX2Cm">>]
of course you can use the same kind of code if the tuple {struct,InData} may be in a different place in the Data variable.
-module(wy).
-compile(export_all).
main() ->
Data = [{<<"status">>,<<"success">>},
{<<"META">>,
{struct,[{<<"createdat">>,1406895903.0},
{<<"user_email">>,<<"gopikrishnajonnada#gmail.com">>},
{<<"campaign">>,<<"5IVUPHE42HP1NEYvKb7qSvpX2Cm">>}]
}
},
{<<"mode">>,1}],
Fun = fun({<<"META">>, {struct, InData}}, Acc) ->
Value = proplists:get_value(<<"campaign">>, InData, []),
[Value | Acc];
(_Other, Acc)->
Acc
end,
lists:foldl(Fun, [], Data).
I think you can use this code.
get_currency() ->
URL = "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.xchange%20where%20pair%20in%20(%22GBPEUR%22)&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys",
{Result, Info} = httpc:request(URL),
case Result of
error ->
{Result, Info};
ok ->
{{_Protocol, Code, _CodeStr}, _Attrs, WebData} = Info,
WebData
end.
extract_text(Content) ->
Item = hd(Content),
case element(1, Item) of
xmlText -> Item#xmlText.value;
_ -> ""
end.
analyze_info(WebData) ->
ToFind = [rate],
Parsed = element(1, xmerl_scan:string(WebData)),
Children = Parsed#xmlElement.content,
ElementList = [{El#xmlElement.name, extract_text(El#xmlElement.content)} || El <- Children, element(1, El) == xmlElement],
lists:map(fun(Item) -> lists:keyfind(Item, 1, ElementList) end, ToFind).
the above is the code im using to try to extract the contents of the tag from the url http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.xchange%20where%20pair%20in%20(%22GBPEUR%22)&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys.
here is what i do in the shell.
inets:start().
XML = scrapetest:get_currency().
scrapetest:analyze_info(XML).
and the return i get is simply "false". Im not sure what im doing wrong.
Just add some logs to your code.
Eg. adding io:format("~p~n", [ElementList]), - will show you that ElementList contains only result tag, and you should go one level deeper in your list comprehension to get tag named rate
This is common advice.
In your case, seems that better decision is recursive find function (if you want to write some code)
or use some batteries, like xmerl_xpath
Just example for another analyze_info :
analyze_info(WebData) ->
Parsed = element(1, xmerl_scan:string(WebData)),
xmerl_xpath:string("//Rate/text()", Parsed).
This will return:
[{xmlText,[{'Rate',2},{rate,1},{results,1},{query,1}],
1,[],"1.1813",text}]
I am currently playing around with minimal web servers, like Cowboy. I want to pass a number in the URL, load lines of a file, sort these lines and print the element in the middle to test IO and sorting.
So the code loads the path like /123, makes a padded "00123" out of the number, loads the file "input00123.txt" and sorts its content and then returns something like "input00123.txt 0.50000".
At the sime time I have a test tool which makes 50 simultaneous requests, where only 2 get answered, the rest times out.
My handler looks like the following:
-module(toppage_handler).
-export([init/3]).
-export([handle/2]).
-export([terminate/3]).
init(_Transport, Req, []) ->
{ok, Req, undefined}.
readlines(FileName) ->
{ok, Device} = file:open(FileName, [read]),
get_all_lines(Device, []).
get_all_lines(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), Accum;
Line -> get_all_lines(Device, Accum ++ [Line])
end.
handle(Req, State) ->
{PathBin, _} = cowboy_req:path(Req),
case PathBin of
<<"/">> -> Output = <<"Hello, world!">>;
_ -> PathNum = string:substr(binary_to_list(PathBin),2),
Num = string:right(PathNum, 5, $0),
Filename = string:concat("input",string:concat(Num, ".txt")),
Filepath = string:concat("../data/",Filename),
SortedLines = lists:sort(readlines(Filepath)),
MiddleIndex = erlang:trunc(length(SortedLines)/2),
MiddleElement = lists:nth(MiddleIndex, SortedLines),
Output = iolist_to_binary(io_lib:format("~s\t~s",[Filename,MiddleElement]))
end,
{ok, ReqRes} = cowboy_req:reply(200, [], Output, Req),
{ok, ReqRes, State}.
terminate(_Reason, _Req, _State) ->
ok.
I am running this on Windows to compare it with .NET. Is there anything to make this more performant, like running the sorting/IO in threads or how can I improve it? Running with cygwin didn't change the result a lot, I got about 5-6 requests answered.
Thanks in advance!
The most glaring issue: get_all_lines is O(N^2) because list concatenation (++) is O(N). Erlang list type is a singly linked list. The typical approach here is to use "cons" operator, appending to the head of the list, and reverse accumulator at the end:
get_all_lines(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), lists:reverse(Accum);
Line -> get_all_lines(Device, [Line | Accum])
end.
Pass binary flag to file:open to use binaries instead of strings (which are just lists of characters in Erlang), they are much more memory and CPU-friendly.