I am currently playing around with minimal web servers, like Cowboy. I want to pass a number in the URL, load lines of a file, sort these lines and print the element in the middle to test IO and sorting.
So the code loads the path like /123, makes a padded "00123" out of the number, loads the file "input00123.txt" and sorts its content and then returns something like "input00123.txt 0.50000".
At the same time I have a test tool which makes 50 simultaneous requests, of which only 2 get answered; the rest time out.
My handler looks like the following:
-module(toppage_handler).
-export([init/3]).
-export([handle/2]).
-export([terminate/3]).
init(_Transport, Req, []) ->
{ok, Req, undefined}.
readlines(FileName) ->
{ok, Device} = file:open(FileName, [read]),
get_all_lines(Device, []).
get_all_lines(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), Accum;
Line -> get_all_lines(Device, Accum ++ [Line])
end.
handle(Req, State) ->
{PathBin, _} = cowboy_req:path(Req),
case PathBin of
<<"/">> -> Output = <<"Hello, world!">>;
_ -> PathNum = string:substr(binary_to_list(PathBin),2),
Num = string:right(PathNum, 5, $0),
Filename = string:concat("input",string:concat(Num, ".txt")),
Filepath = string:concat("../data/",Filename),
SortedLines = lists:sort(readlines(Filepath)),
MiddleIndex = erlang:trunc(length(SortedLines)/2),
MiddleElement = lists:nth(MiddleIndex, SortedLines),
Output = iolist_to_binary(io_lib:format("~s\t~s",[Filename,MiddleElement]))
end,
{ok, ReqRes} = cowboy_req:reply(200, [], Output, Req),
{ok, ReqRes, State}.
terminate(_Reason, _Req, _State) ->
ok.
I am running this on Windows to compare it with .NET. Is there anything to make this more performant, like running the sorting/IO in threads or how can I improve it? Running with cygwin didn't change the result a lot, I got about 5-6 requests answered.
Thanks in advance!
The most glaring issue: get_all_lines is O(N^2), because list concatenation (++) is O(N) in the length of its left operand. Erlang's list type is a singly linked list, so Accum ++ [Line] copies the whole accumulator on every call. The typical approach here is to use the "cons" operator to prepend to the head of the list, and to reverse the accumulator at the end:
get_all_lines(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), lists:reverse(Accum);
Line -> get_all_lines(Device, [Line | Accum])
end.
Also pass the binary flag to file:open to read binaries instead of strings (which are just lists of characters in Erlang); binaries are much more memory- and CPU-friendly.
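To make the binary suggestion concrete, here is a minimal sketch that reads the whole file in one call and splits it into binary lines. The module name middle_line and both function names are made up for illustration, and the middle index (N div 2 + 1) is my own choice so that a one-line file does not crash; it is not what the original handler computes.

```erlang
-module(middle_line).
-export([middle/1, middle_of_file/1]).

%% Sort the lines and return the one in the middle.
%% The index N div 2 + 1 is an assumption of this sketch; it keeps
%% lists:nth/2 within range even for a single-element list.
middle([]) ->
    undefined;
middle(Lines) ->
    Sorted = lists:sort(Lines),
    lists:nth(length(Sorted) div 2 + 1, Sorted).

%% Read the file as one binary and split it on LF or CRLF.
%% trim_all drops the empty pieces (e.g. after a trailing newline).
middle_of_file(Path) ->
    {ok, Bin} = file:read_file(Path),
    Lines = binary:split(Bin, [<<"\r\n">>, <<"\n">>], [global, trim_all]),
    middle(Lines).
```

In the handler, the result is already a binary, so it could be put into the reply without going through string functions.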
-module(solarSystem).
-export([process_csv/1, is_numeric/1, parseALine/2, parse/1, expandT/1, expandT/2,
parseNames/1]).
parseALine(false, T) ->
T;
parseALine(true, T) ->
T.
parse([Name, Colour, Distance, Angle, AngleVelocity, Radius, "1" | T]) ->
T;%Where T is a list of names of other objects in the solar system
parse([Name, Colour, Distance, Angle, AngleVelocity, Radius | T]) ->
T.
parseNames([H | T]) ->
H.
expandT(T) ->
T.
expandT([], Sep) ->
[];
expandT([H | T], Sep) ->
T.
% https://rosettacode.org/wiki/Determine_if_a_string_is_numeric#Erlang
is_numeric(L) ->
S = trim(L, ""),
Float = (catch erlang:list_to_float(S)),
Int = (catch erlang:list_to_integer(S)),
is_number(Float) orelse is_number(Int).
trim(A) ->
A.
trim([], A) ->
A;
trim([32 | T], A) ->
trim(T, A);
trim([H | T], A) ->
trim(T, A ++ [H]).
process_csv(L) ->
X = parse(L),
expandT(X).
The problem is that a main function will call process_csv/1 in my module, and L will be the content of a file like this:
[["name "," col"," dist"," a"," angv"," r "," ..."],["apollo11 ","white"," 0.1"," 0"," 77760"," 0.15"]]
Or like this:
["planets ","earth","venus "]
Or like this:
["a","b"]
I need to display it as follows:
apollo11 =["white", 0.1, 0, 77760, 0.15,[]];
Planets =[earth,venus]
a,b
[[59],[97],[44],[98]]
My problem is that no matter how I make changes, it can only show a part, and there are no symbols. The list cannot be divided, so I can't find a way.
In addition, because Erlang is a niche programming language, I can't even find examples online.
So, can anyone help me? Thank you, very much.
In addition, I am restricted from using recursion.
I think the first problem is that it is hard to link what you are trying to achieve with what your code does so far. Therefore, this feedback may not be exactly what you are looking for, but it might give you some ideas. Let's structure the problem into the common elements: (1) input, (2) process, and (3) output.
Input
You mentioned that L will be a file, but I assume it is a line in a file, where each line can be one of the 3 (three) samples. The samples also do not follow a consistent pattern. For this, we can build a function that converts each line of the file into an Erlang term and passes the result to the next step.
Process
The question also does not mention the specific logic for parsing/processing the input. You also seem to care about the data type, so we will convert and display the result accordingly. Erlang, as a functional language, naturally works with lists, so in most cases we will need to use functions from the lists module.
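Since you are not allowed to write recursion yourself, the lists module does the traversal for you. A small sketch of the idea (the module and function names are only for illustration, and string:trim/1 assumes OTP 20 or later):

```erlang
-module(no_recursion).
-export([trim_all/1, count/1]).

%% Apply string:trim/1 to every element without writing a loop ourselves.
trim_all(Strings) ->
    lists:map(fun string:trim/1, Strings).

%% Count the elements with a fold; the accumulator starts at 0.
count(Items) ->
    lists:foldl(fun(_Item, N) -> N + 1 end, 0, Items).
```

For example, no_recursion:trim_all(["planets ","earth","venus "]) gives ["planets","earth","venus"].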
Output
You didn't specifically mention where you want to display the result (an output file, screen/erlang shell, etc), so let's assume you just want to display it in the standard output/erlang shell.
Sample file content test1.txt (please note the dot at the end of each line)
[["name "," col"," dist"," a"," angv"," r "],["apollo11 ","white","0.1"," 0"," 77760"," 0.15"]].
["planets ","earth","venus "].
["a","b"].
How to run: solarSystem:process_file("/Users/macbook/Documents/test1.txt").
Sample Result:
(dev01@Macbooks-MacBook-Pro-3)3> solarSystem:process_file("/Users/macbook/Documents/test1.txt").
apollo11 = ["white",0.1,0,77760,0.15]
planets = ["earth","venus"]
a = ["b"]
Done processing 3 line(s)
ok
Module code:
-module(solarSystem).
-export([process_file/1]).
-export([process_line/2]).
-export([format_item/1]).
%%This is the main function, input is file full path
%%Howto call: solarSystem:process_file("file_full_path").
process_file(Filename) ->
%%Use file:consult to convert the file content into erlang terms
%%File content is a dot (".") separated line
{StatusOpen, Result} = file:consult(Filename),
case StatusOpen of
ok ->
%%Result is a list and therefore each element must be handled using lists function
Ctr = lists:foldl(fun process_line/2, 0, Result),
io:format("Done processing ~p line(s) ~n", [Ctr]);
_ -> %%This is for the case where file not available
io:format("Error converting file ~p due to '~p' ~n", [Filename, Result])
end.
process_line(Term, CtrIn) ->
%%Assume there are few possibilities of element. There are so many ways to process the data as long as the input pattern is clear.
%%We basically need to identify all possibilities and handle them accordingly.
%%Of course there are smarter (dynamic) ways to handle them, but below may give you some ideas.
case Term of
%%1. This is to handle this pattern -> [["name "," col"," dist"," a"," angv"," r "],["apollo11 ","white"," 0.1"," 0"," 77760"," 0.15"]]
[[_, _, _, _, _, _], [Name | OtherParams]] ->
%%At this point, Name = "apollo11 ", OtherParams = ["white"," 0.1"," 0"," 77760"," 0.15"]
OtherParamsFmt = lists:map(fun format_item/1, OtherParams),
%%Display the result to standard output
io:format("~s = ~p ~n", [string:trim(Name), OtherParamsFmt]);
%%2. This is to handle this pattern -> ["planets ","earth","venus "]
[Name | OtherParams] ->
%%At this point, Name = "planets ", OtherParams = ["earth","venus "]
OtherParamsFmt = lists:map(fun format_item/1, OtherParams),
%%Display the result to standard output
io:format("~s = ~p ~n", [string:trim(Name), OtherParamsFmt]);
%%3. Other cases
_ ->
%%Display the warning to standard output
io:format("Unknown pattern ~p ~n", [Term])
end,
CtrIn + 1.
%%This is to format the string accordingly
format_item(Str) ->
StrTrim = string:trim(Str), %%first, trim it
format_as_needed(StrTrim).
format_as_needed(Str) ->
Float = (catch erlang:list_to_float(Str)),
case Float of
{'EXIT', _} -> %%It is not a float -> check if it is an integer
Int = (catch erlang:list_to_integer(Str)),
case Int of
{'EXIT', _} -> %%It is not an integer -> return as is (string)
Str;
_ -> %%It is an int
Int
end;
_ -> %%It is a float
Float
end.
When executing an implementation of the Tarry distributed algorithm, a problem occurs that I don't know how to address: a crash containing the error {undef,[{rand,uniform,[2],[]}. My module is below:
-module(assign2_ex).
-compile(export_all).
%% Tarry's Algorithm with depth-first version
start() ->
Out = get_lines([]),
Nodes = createNodes(tl(Out)),
Initial = lists:keyfind(hd(Out), 1, Nodes),
InitialPid = element(2, Initial),
InitialPid ! {{"main", self()}, []},
receive
{_, List} ->
Names = lists:map(fun(X) -> element(1, X) end, List),
String = lists:join(" ", lists:reverse(Names)),
io:format("~s~n", [String])
end.
get_lines(Lines) ->
case io:get_line("") of
%% End of file, reverse the input for correct order
eof -> lists:reverse(Lines);
Line ->
%% Split each line on spaces and new lines
Nodes = string:tokens(Line, " \n"),
%% Check next line and add nodes to the result
get_lines([Nodes | Lines])
end.
%% Create Nodes
createNodes(List) ->
NodeNames = [[lists:nth(1, Node)] || Node <- List],
Neighbours = [tl(SubList) || SubList <- List],
Pids = [spawn(assign2_ex, midFunction, [Name]) || Name <-NodeNames],
NodeIDs = lists:zip(NodeNames, Pids),
NeighbourIDs = [getNeighbours(N, NodeIDs) || N <- lists:zip(NodeIDs, Neighbours)],
[Pid ! NeighbourPids || {{_, Pid}, NeighbourPids} <- NeighbourIDs],
NodeIDs.
getNeighbours({{Name, PID}, NeighboursForOne}, NodeIDs) ->
FuncMap = fun(Node) -> lists:keyfind([Node], 1, NodeIDs) end,
{{Name, PID}, lists:map(FuncMap, NeighboursForOne)}.
midFunction(Node) ->
receive
Neighbours -> tarry_depth(Node, Neighbours, [])
end.
%% Tarry's Algorithm with depth-first version
%% Doesn't visit the nodes which have been visited
tarry_depth(Name, Neighbours, OldParent) ->
receive
{Sender, Visited} ->
Parent = case OldParent of [] -> [Sender]; _ -> OldParent end,
Unvisited = lists:subtract(Neighbours, Visited),
Next = case Unvisited of
[] -> hd(Parent);
_ -> lists:nth(rand:uniform(length(Unvisited)), Unvisited)
end,
Self = {Name, self()},
element(2, Next) ! {Self, [Self | Visited]},
tarry_depth(Name, Neighbours, Parent)
end.
An undef error means that the program tried to call an undefined function. There are three ways this can happen:
There is no module with that name (in this case rand), or it cannot be found and loaded for some reason
The module doesn't define a function with that name and arity. In this case, the function in question is uniform with one argument. (Note that in Erlang, functions with the same name but different numbers of arguments are considered separate functions.)
There is such a function, but it isn't exported.
You can check the first by typing l(rand). in an Erlang shell, and the second and third by running rand:module_info(exports)..
In this case, I suspect that the problem is that you're using an old version of Erlang/OTP. As noted in the documentation, the rand module was introduced in release 18.0.
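The same checks can also be done from code. The following is just a sketch (the module name undef_check and the function available/3 are made up for illustration); it loads the module if possible and then asks whether the function is exported with the given arity:

```erlang
-module(undef_check).
-export([available/3]).

%% Returns true only if Mod can be loaded and exports Fun/Arity.
available(Mod, Fun, Arity) ->
    case code:ensure_loaded(Mod) of
        {module, Mod} ->
            erlang:function_exported(Mod, Fun, Arity);
        {error, _Reason} ->
            %% The module is missing or could not be loaded.
            false
    end.
```

On a release without the rand module, undef_check:available(rand, uniform, 1) would return false.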
For future questions, it would be good to state which version of Erlang/OTP you are using, as Erlang has changed a lot over the years. Note that the [2] in the error tuple is the list of arguments, not the arity: the failing call is rand:uniform(2), i.e. rand:uniform/1, so the undef points to a release that does not have the rand module. On such a release you could use crypto:rand_uniform/2 instead, like crypto:rand_uniform(Low, High), which returns an integer N with Low =< N < High. Hope this helps :)
I have the following functions:
search(DirName, Word) ->
NumberedFiles = list_numbered_files(DirName),
Words = make_filter_mapper(Word),
Index = mapreduce(NumberedFiles, Words, fun remove_duplicates/3),
dict:find(Word, Index).
list_numbered_files(DirName) ->
{ok, Files} = file:list_dir(DirName),
FullFiles = [ filename:join(DirName, File) || File <- Files ],
Indices = lists:seq(1, length(Files)),
lists:zip(Indices, FullFiles). % {Index, FileName} tuples
make_filter_mapper(MatchWord) ->
fun (_Index, FileName, Emit) ->
{ok, [Words]} = file:consult(FileName), %% <---- Line 20
lists:foreach(fun (Word) ->
case MatchWord == Word of
true -> Emit(Word, FileName);
false -> false
end
end, Words)
end.
remove_duplicates(Word, FileNames, Emit) ->
UniqueFiles = sets:to_list(sets:from_list(FileNames)),
lists:foreach(fun (FileName) -> Emit(Word, FileName) end, UniqueFiles).
However, when I call search(Path_to_Dir, Word) I get:
Error in process <0.185.0> with exit value:
{{badmatch,{error,{1,erl_parse,["syntax error before: ","wordinfile"]}}},
[{test,'-make_filter_mapper/1-fun-1-',4,[{file,"test.erl"},{line,20}]}]}
And I do not understand why. Any ideas?
The pattern {ok, [Words]} only matches when the file contains exactly one term, but it may contain many of them. Try to match {ok, Words} instead of {ok, [Words]}.
Besides the fact that file:consult/1 may return a list of several elements, so you should replace {ok,[Words]} (which expects a list of exactly one element) with {ok,Words}, the call actually returns a syntax error here, meaning that the file you are reading contains something that is not a valid Erlang term.
Remember that the file should contain only valid erlang terms, each of them terminated by a dot. The most common error is to forget a dot or replace it by a comma.
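To illustrate the difference, here is a small sketch (module and function names are hypothetical). roundtrip/1 writes a file in the format file:consult/1 expects and reads it back; words/1 is an alternative for files that just contain plain words separated by whitespace, which is my guess at what your files look like given the "wordinfile" in the error.

```erlang
-module(consult_demo).
-export([roundtrip/1, words/1]).

%% Write two dot-terminated terms and read them back with file:consult/1.
%% The result is {ok, [Term1, Term2]}, one list element per term.
roundtrip(Path) ->
    ok = file:write_file(Path, <<"[\"foo\", \"bar\"].\n{answer, 42}.\n">>),
    file:consult(Path).

%% Read a plain-text file and split it into words (as binaries)
%% on spaces, tabs and line breaks. No Erlang term syntax is required.
words(Path) ->
    {ok, Bin} = file:read_file(Path),
    binary:split(Bin, [<<" ">>, <<"\t">>, <<"\r">>, <<"\n">>], [global, trim_all]).
```

Note that words/1 returns binaries, so the comparison in the mapper would have to be made against a binary as well.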
I asked a similar question; I'm not sure what wasn't clear about it, but I'll try again. I have a file named file.txt, and I read file.txt into a list. Now I can print this to the console and it will show:
blah
blah
blah
blah
That is fine. Perfect :) Now how would I forward that to a new file? so that the new file contains:
blah
blah
blah
blah
Nothing more and nothing less. Here is the code I am using to read a file in to a list:
{ok, Device} = file:open("file.txt", [read]),
Li = readdata(Device, []).
readdata(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), Accum;
Line -> readdata(Device, Accum ++ [Line])
end.
So again, the new file will display EXACTLY what the file I read displays: no extra characters, not all on 1 line, etc. Just the same :)
Well, the easy way is:
ok = file:write_file("output.txt", Li).
As you may see in http://www.erlang.org/doc/man/file.html , there are plenty of useful functions like file:read_file/1 that may shorten your program and at the same time make it a little quicker.
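For instance, if the goal is only to reproduce the file byte for byte, a sketch like this would do (the module name copy_demo is just for illustration):

```erlang
-module(copy_demo).
-export([copy_whole/2]).

%% Read the whole source file into one binary and write it out unchanged.
%% Line endings are preserved exactly because nothing is split or rejoined.
copy_whole(From, To) ->
    {ok, Bin} = file:read_file(From),
    ok = file:write_file(To, Bin).
```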
You see, the way you combine the data you read with the accumulator is not ideal, because Accum ++ [Line] copies Accum on every call, so the complexity of your readdata/2 function is O(N^2). Prepending each line to the head of the accumulator is the better way, but then of course you have to reverse it at the end.
And what about the size of the file? If it is huge and doesn't fit into memory, you'll have problems even when the accumulator is used properly. The standard way in this case is to open both files, read a chunk of data and immediately write it to the output.
copy_file() ->
{ok, In} = file:open("input", [read]),
{ok, Out} = file:open("output", [write]),
copy_file(In, Out),
file:close(In),
file:close(Out).
copy_file(In, Out) ->
case file:read(In, 1024 * 64) of
{ok, Data} ->
ok = file:write(Out, Data),
copy_file(In, Out);
_ ->
ok
end.
I haven't tried the code, it may not compile, I just tried to show the basic idea.
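If a plain copy is all that is needed, the file module also has file:copy/2, which does the chunked reading and writing for you. A minimal sketch (the wrapper module stream_copy is hypothetical):

```erlang
-module(stream_copy).
-export([copy/2]).

%% file:copy/2 copies Source to Destination without loading the whole
%% file into a list, and returns the number of bytes copied.
copy(Source, Destination) ->
    case file:copy(Source, Destination) of
        {ok, BytesCopied} -> {ok, BytesCopied};
        {error, Reason} -> {error, Reason}
    end.
```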
So this is what I came up with. I modified your readdata/2 slightly to optimize the append and remove the newline. The write/2 function uses lists:foreach/2 and io:fwrite/3 to write to the file.
-module(rwlist).
-export([read/1,write/2]).
read(FileName) ->
case file:open(FileName, [read]) of
{ok, Device} ->
readdata(Device, [])
end.
readdata(Device, Accum) ->
case io:get_line(Device, "") of
eof -> file:close(Device), lists:reverse(Accum);
Line -> readdata(Device, [(Line--"\n")|Accum])
end.
write(FileName, List) ->
case file:open(FileName, [write]) of
{ok, Device} ->
lists:foreach(fun(Line) -> writeline(Device, Line) end, List),
file:close(Device)
end.
writeline(Device, Line) -> writeline(Device, Line, os:type()).
writeline(Device, Line, {win32,_}) -> io:fwrite(Device, "~s\r\n", [Line]);
writeline(Device, Line, _) -> io:fwrite(Device, "~s\n", [Line]).
Here's the test...
57> List=rwlist:read("list").
["item 1","item 2","item 3","item 4"]
58> rwlist:write("list2", List).
ok
59> List2=rwlist:read("list2").
["item 1","item 2","item 3","item 4"]
Of course if you are just copying a file Dmitry's answer is better.
I need to put data in a file since my other function takes a file as input.
How do I create a unique filename in Erlang?
Does something like unix "tempfile" exist?
Do you mean just generating the actual filename? In that case the safest way would be to use a mix of the numbers you get from now() and the hostname of your computer (if you have several nodes doing the same thing).
Something like:
1> {A,B,C}=now().
{1249,304278,322000}
2> N=node().
nonode@nohost
3> lists:flatten(io_lib:format("~p-~p.~p.~p",[N,A,B,C])).
"nonode@nohost-1249.304278.322000"
4>
You can also use TMP = lib:nonl(os:cmd("mktemp")).
Or you could do
erlang:phash2(make_ref())
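for a quick and easy identifier. The reference itself is unique for roughly 2^82 calls, but note that erlang:phash2/1 maps it into the range 0..2^27-1, so collisions are possible, although this should usually be enough for your purposes. I find this easier than formatting a timestamp with the node name.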
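On OTP 18 or later, another option along the same lines is erlang:unique_integer/1. This is only a sketch (the module and function names are made up), and the integers are unique only within one running node, which is why the node name is mixed in:

```erlang
-module(unique_name).
-export([new/1]).

%% Build a name from a prefix, the node name and an integer that is
%% unique for the lifetime of this runtime system instance.
new(Prefix) ->
    N = erlang:unique_integer([positive]),
    Prefix ++ "-" ++ atom_to_list(node()) ++ "-" ++ integer_to_list(N).
```

Names generated after a node restart can repeat, so this does not replace checking whether the file already exists.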
Late answer: I just noticed the test_server module which has scratch directory support, worth a look
http://www.erlang.org/doc/man/test_server.html#temp_name-1
I've finally had this problem -- and my user is using a mix of Windows and Linux systems, so the old tried-and-true lib:nonl(os:cmd("mktemp")) method is just not going to cut it anymore.
So here is how I've approached it, both with a mktemp/1 function that returns a filename that can be used and also a mktemp_dir/1 function that returns a directory (after having created it).
-spec mktemp(Prefix) -> Result
when Prefix :: string(),
Result :: {ok, TempFile :: file:filename()}
| {error, Reason :: file:posix()}.
mktemp(Prefix) ->
Rand = integer_to_list(binary:decode_unsigned(crypto:strong_rand_bytes(8)), 36),
TempPath = filename:basedir(user_cache, Prefix),
TempFile = filename:join(TempPath, Rand),
Result1 = filelib:ensure_dir(TempFile),
Result2 = file:write_file(TempFile, <<>>),
case {Result1, Result2} of
{ok, ok} -> {ok, TempFile};
{ok, Error} -> Error;
{Error, _} -> Error
end.
And the directory version:
-spec mktemp_dir(Prefix) -> Result
when Prefix :: string(),
Result :: {ok, TempDir :: file:filename()}
| {error, Reason :: file:posix()}.
mktemp_dir(Prefix) ->
Rand = integer_to_list(binary:decode_unsigned(crypto:strong_rand_bytes(8)), 36),
TempPath = filename:basedir(user_cache, Prefix),
TempDir = filename:join(TempPath, Rand),
Result1 = filelib:ensure_dir(TempDir),
Result2 = file:make_dir(TempDir),
case {Result1, Result2} of
{ok, ok} -> {ok, TempDir};
{ok, Error} -> Error;
{Error, _} -> Error
end.
Both of these do basically the same thing: we get a strongly random name as a binary, convert that to a base36 string, and append it to whatever the OS returns to us as a safe user-local temporary cache location.
On a unix type system, of course, we could just use filename:join(["/tmp", Prefix, Rand]) but the unavailability of /tmp on Windows is sort of the whole point here.
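One variation on the file version, as a sketch only: instead of file:write_file/2, the file can be opened with the exclusive option, so that creation fails if the name already exists and we can retry with a new random name. This is a different technique from the code above, not a correction of it; the module name tmpfile, the Dir argument (which must already exist) and the retry count of 5 are all assumptions for illustration.

```erlang
-module(tmpfile).
-export([create/1]).

%% Create a new empty file with a random name inside Dir.
create(Dir) ->
    create(Dir, 5).

create(_Dir, 0) ->
    {error, eexist};
create(Dir, Tries) ->
    Rand = integer_to_list(binary:decode_unsigned(crypto:strong_rand_bytes(8)), 36),
    Path = filename:join(Dir, Rand),
    %% exclusive makes the open fail with eexist if Path already exists.
    case file:open(Path, [write, exclusive]) of
        {ok, Fd} ->
            ok = file:close(Fd),
            {ok, Path};
        {error, eexist} ->
            create(Dir, Tries - 1);
        {error, _} = Error ->
            Error
    end.
```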
In OTP 24 there is no file:ensure_dir, so I've made something similar:
For directory:
mktemp_dir(Prefix) ->
Rand = integer_to_list(binary:decode_unsigned(crypto:strong_rand_bytes(8)), 36),
TempDir = filename:basedir(user_cache, Prefix),
[]= os:cmd("mkdir " ++ "\"" ++ TempDir ++ "\""),
{ok, _} = file:list_dir(TempDir),
TempDir.
For file:
mktemp(Prefix) ->
Rand = integer_to_list(binary:decode_unsigned(crypto:strong_rand_bytes(8)), 36),
TempDir = filename:basedir(user_cache, Prefix),
TempFile = filename:join(TempDir, Rand),
[]= os:cmd("mkdir " ++ "\"" ++ TempDir ++ "\""),
{ok, _} = file:list_dir(TempDir),
Result = file:write_file(TempFile, <<>>),
case {Result} of
{ok} -> {ok, TempFile};
{Error} -> Error
end.