Spawning 1000 processes at the same time in Erlang

I want to spawn 1000 or a variable number of processes in Erlang.
server.erl:
-module(server).
-export([start/2]).
start(LeadingZeroes, InputString) ->
% io:format("Leading Zeroes: ~w", [LeadingZeroes]),
% io:format("InputString: ~p", [InputString]).
mineCoins(LeadingZeroes, InputString, 100).
mineCoins(LeadingZeroes, InputString, Target) ->
PID = spawn(miner, findTargetHash(), []), % How to spawn this process 1000 times so that each process computes something and sends the results here
PID ! {self(), {mine, LeadingZeroes, InputString, Target}},
receive
{found, Number} ->
io:fwrite("Rectangle area: ~w", [Number]);
% {square, Area} ->
% io:fwrite("Square area: ~w", [Area]);
Other ->
io:fwrite("In Other!")
end.
% io:fwrite("Yolo: ~w", [Square_Area]).
miner.erl (client):
-module(miner).
-export([findTargetHash/0]).
findTargetHash() ->
receive
{From , {mine, LeadingZeroes, InputString, Target}} ->
% do something here
From ! {found, Number};
{From, {else, X}} ->
io:fwrite("In Else area"),
From ! {square, X*X}
end,
findTargetHash().
I want to spawn 1000 of these miner processes. How can I do that: with a list comprehension, recursion, or some other way?

Generally, you can do something N times like this:
-module(a).
-compile(export_all).
go(0) ->
io:format("!finished!~n");
go(N) ->
io:format("Doing something: ~w~n", [N]),
go(N-1).
In the shell:
3> c(a).
a.erl:2:2: Warning: export_all flag enabled - all functions will be exported
% 2| -compile(export_all).
% | ^
{ok,a}
4> a:go(3).
Doing something: 3
Doing something: 2
Doing something: 1
!finished!
ok
If you need to start N processes and subsequently send messages to them, then you will need their pids to do that, so you will have to save their pids somewhere:
go(0, Pids) ->
io:format("All workers have been started.~n"),
Pids;
go(N, Pids) ->
Pid = spawn(b, worker, [self()]),
go(N-1, [Pid|Pids]).
-module(b).
-compile(export_all).
worker(From) ->
receive
{From, Data} ->
io:format("Worker ~w received ~w.~n", [self(), Data]),
From ! {self(), Data * 3};
Other ->
io:format("Error, received ~w.~n", [Other])
end.
To start N=3 worker processes, you would call go/2 like this:
Pids = a:go(3, []).
That's a little bit awkward for someone who didn't write the code: why do I have to pass an empty list? So, you could define a go/1 like this:
go(N) -> go(N, []).
Then, you can start 3 worker processes by simply writing:
Pids = go(3).
Next, you need to send each of the worker processes a message containing the work they need to do:
do_work([Pid|Pids], [Data|Datum]) ->
Pid ! {self(), Data},
do_work(Pids, Datum);
do_work([], []) ->
io:format("All workers have been sent their work.~n").
Finally, you need to gather the results from the workers:
gather_results([Worker|Workers], Results) ->
receive
{Worker, Result} ->
gather_results(Workers, [Result|Results])
end;
gather_results([], Results) ->
Results.
A couple of things to note about gather_results/2:
The Worker variable in the receive has already been assigned a value in the head of the function, so the receive is not waiting for just any worker process to send a message, rather the receive is waiting for a particular worker process to send a message.
The first Worker process in the list of Workers may be the longest running process, and you may wait in the receive for, say, 10 minutes for that process to finish, but then getting the results from the other worker processes will require no waiting. Therefore, gathering all the results will essentially take as long as the longest process plus a few microseconds to loop through the other processes. Similarly, for other orderings of the longest and shortest processes in the list, it will only take a time equal to the longest process plus a few microseconds to receive all the results.
Here is a test run in the shell:
27> c(a).
a.erl:2:2: Warning: export_all flag enabled - all functions will be exported
% 2| -compile(export_all).
% | ^
{ok,a}
28> c(b).
b.erl:2:2: Warning: export_all flag enabled - all functions will be exported
% 2| -compile(export_all).
% | ^
{ok,b}
29> Pids = a:go(3, []).
All workers have been started.
[<0.176.0>,<0.175.0>,<0.174.0>]
30> a:do_work(Pids, [1, 2, 3]).
All workers have been sent their work.
Worker <0.176.0> received 1.
Worker <0.175.0> received 2.
Worker <0.174.0> received 3.
ok
31> a:gather_results(Pids, []).
[9,6,3]
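The question also asked about list comprehensions. The same start, send, and gather steps can be sketched that way. This is only a sketch: the module name pool and its run/1 function are hypothetical, and the worker simply triples its input like worker/1 above.

```erlang
-module(pool).
-export([run/1]).

%% Hypothetical worker: waits for one job from Parent, replies, then exits.
worker(Parent) ->
    receive
        {Parent, Data} ->
            Parent ! {self(), Data * 3}
    end.

%% Spawn one worker per element of Inputs, send each worker its job,
%% then collect the replies in the same order as Inputs.
run(Inputs) ->
    Parent = self(),
    Pids = [spawn(fun() -> worker(Parent) end) || _ <- Inputs],
    lists:foreach(fun({Pid, Data}) -> Pid ! {Parent, Data} end,
                  lists:zip(Pids, Inputs)),
    %% Pid is already bound here, so each receive waits for that worker.
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```

Calling pool:run(lists:seq(1, 1000)) would start 1000 worker processes, one per input.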


ETS does not seem to store my insert

I'm trying to implement a process that I can query/update for some state information (I'm working on an SMS service and want to store some local data based on responses - later I will use a DB but for now I want to use ETS, this is my first Erlang project so I think it's useful to learn). Unfortunately it seems like my inserts are not coming through and I don't understand why. This is the module:
-module(st).
-compile(export_all).
maintain_state() ->
Tab = ets:new(state, [set]),
receive
{Pid, lookup, Key} ->
Pid ! ets:lookup(Tab, Key),
maintain_state();
{Pid, update, Key, Handler} ->
NewState = Handler(ets:lookup(Tab, Key)),
Status = ets:insert(Tab, NewState),
Pid ! {Status, NewState},
maintain_state();
{Pid, statelist} ->
Pid ! ets:tab2list(Tab),
maintain_state();
kill ->
void
end,
ets:delete(Tab).
start_state_maintainer() ->
Pid = spawn(st, maintain_state, []),
register(state, Pid).
update_state(StateHandler) ->
state ! {self(), update, testing, StateHandler},
receive
After ->
After
after 1000 ->
throw("Timeout in update_state")
end.
lookup_state() ->
state ! {self(), lookup, testing},
receive
Value ->
Value
after 1000 ->
throw("Timeout in lookup_state")
end.
all_state() ->
state ! {self(), statelist},
receive
Value ->
Value
after 1000 ->
throw("Timeout in all_state")
end.
Which I then load in an erl session:
> c(st).
> st:start_state_maintainer().
> st:lookup_state().
[]
> st:update_state(fun (St) -> {testing, myval} end).
{true, {testing, myval}}
> st:all_state().
[]
Since update_state shows true I figured the insert was successful, but nothing seems to be stored in the table. What am I doing wrong?
PS: if this whole approach is flawed or you have other remarks about my code I would appreciate those as well.
Ok. Let's run your code again.
1> c(st). % compile your code
{ok,st}
% Before doing anything, let's get the count of all ETS tables using ets:all/0
2> length(ets:all()).
16 % So the Erlang VM has 16 tables after starting it
3> st:start_state_maintainer().
true
% Let's check count of tables again:
4> length(ets:all()).
17 % Your process has created its own table
5> st:lookup_state().
[]
% Check count of tables again
6> length(ets:all()).
18 % Why????
7> st:update_state(fun (St) -> {testing, myval} end).
{true,{testing,myval}}
8> length(ets:all()).
19
9> st:all_state().
[]
10> length(ets:all()).
20
So at the top of maintain_state/0 you create an ETS table, and every receive clause except kill calls maintain_state/0 again. That means that after handling each message (except kill) you create a new ETS table!
Let's see those tables:
11> P = whereis(state). % Get process id of 'state' and assign it to P
<0.66.0>
12> Foreach =
fun(Tab) ->
case ets:info(Tab, owner) of
P -> % If owner of table is state's pid
io:format("Table ~p with data ~p~n"
,[Tab, ets:tab2list(Tab)]);
_ ->
ok
end
end.
#Fun<erl_eval.6.118419387>
13> lists:foreach(Foreach, ets:all()).
Table 28691 with data []
Table 24594 with data []
Table 20497 with data [{testing,myval}]
Table 16400 with data []
ok
And after killing your process, we should have 16 tables again:
14> exit(P, kill).
true
15> length(ets:all()).
16
You have two choices. You can use a named table like this:
maintain_state() ->
% With 'named_table' option, we can use the name of table in code:
ets:new(state, [set, named_table]), % no need to bind the result; the table is reached by name
maintain_state2().
maintain_state2() ->
receive
{Pid, lookup, Key} ->
Pid ! ets:lookup(state, Key), % I used name of table
maintain_state2();
...
Or use table as argument of maintain_state2:
maintain_state() ->
Tab = ets:new(state, [set]),
maintain_state2(Tab).
maintain_state2(Tab) ->
receive
{Pid, lookup, Key} ->
Pid ! ets:lookup(Tab, Key),
maintain_state2(Tab);
...
I changed the code to one of above examples and here is the result:
1> st:start_state_maintainer().
true
2> st:lookup_state().
[]
3> st:update_state(fun (St) -> {testing, myval} end).
{true,{testing,myval}}
4> st:all_state().
[{testing,myval}]
5> length(ets:all()).
17
Once you have played with Erlang's message passing and understand how it works, I strongly suggest learning the OTP design principles and OTP behaviours such as gen_server, and using them instead of writing your own receive ... and Pid ! ... statements.
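As a rough illustration of that suggestion, here is a minimal gen_server sketch of the same state keeper. The module name st_server and its API function names are assumptions for this example, not part of the original code. The table is created once in init/1 and carried along as the server state.

```erlang
-module(st_server).
-behaviour(gen_server).

-export([start_link/0, lookup/1, update/2, all/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

lookup(Key) -> gen_server:call(?MODULE, {lookup, Key}).
update(Key, Handler) -> gen_server:call(?MODULE, {update, Key, Handler}).
all() -> gen_server:call(?MODULE, statelist).

%% The ETS table is created exactly once and becomes the server state.
init([]) ->
    {ok, ets:new(state, [set])}.

handle_call({lookup, Key}, _From, Tab) ->
    {reply, ets:lookup(Tab, Key), Tab};
handle_call({update, Key, Handler}, _From, Tab) ->
    NewState = Handler(ets:lookup(Tab, Key)),
    true = ets:insert(Tab, NewState),
    {reply, {true, NewState}, Tab};
handle_call(statelist, _From, Tab) ->
    {reply, ets:tab2list(Tab), Tab}.

handle_cast(_Msg, Tab) ->
    {noreply, Tab}.
```

gen_server:call/2 also takes care of the reply matching and the timeout that update_state/1, lookup_state/0 and all_state/0 implement by hand.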

How exactly does the Erlang receive expression work?

Why is the receive expression sometimes called selective receive?
What is the "save queue"?
How does the after section work?
The procedure involves a special "save queue", whose presence you may ignore when you first encounter the receive expression.
Optionally, the expression may have an after section, which complicates the procedure a little.
The receive expression is best explained with a flowchart:
receive
pattern1 -> expressions1;
pattern2 -> expressions2;
pattern3 -> expressions3
after
Time -> expressionsTimeout
end
Why is the receive expression sometimes called selective receive?
-module(my).
%-export([test/0, myand/2]).
-compile(export_all).
-include_lib("eunit/include/eunit.hrl").
start() ->
spawn(my, go, []).
go() ->
receive
{xyz, X} ->
io:format("I received X=~w~n", [X])
end.
In the erlang shell:
1> c(my).
my.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> Pid = my:start().
<0.79.0>
3> Pid ! {hello, world}.
{hello,world}
4> Pid ! {xyz, 10}.
I received X=10
{xyz,10}
Note how there was no output for the first message that was sent, but there was output for the second message that was sent. The receive was selective: it did not receive all messages, it received only messages matching the specified pattern.
What is the "save queue"?
-module(my).
%-export([test/0, myand/2]).
-compile(export_all).
-include_lib("eunit/include/eunit.hrl").
start() ->
spawn(my, go, []).
go() ->
receive
{xyz, X} ->
io:format("I received X=~w~n", [X])
end,
io:format("What happened to the message that didn't match?"),
receive
Any ->
io:format("It was saved rather than discarded.~n"),
io:format("Here it is: ~w~n", [Any])
end.
In the erlang shell:
1> c(my).
my.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> Pid = my:start().
<0.79.0>
3> Pid ! {hello, world}.
{hello,world}
4> Pid ! {xyz, 10}.
I received X=10
What happened to the message that didn't match?{xyz,10}
It was saved rather than discarded.
Here it is: {hello,world}
How does the after section work?
-module(my).
%-export([test/0, myand/2]).
-compile(export_all).
-include_lib("eunit/include/eunit.hrl").
start() ->
spawn(my, go, []).
go() ->
receive
{xyz, X} ->
io:format("I received X=~w~n", [X])
after 10000 ->
io:format("I'm not going to wait all day for a match. Bye.")
end.
In the erlang shell:
1> c(my).
my.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> Pid = my:start().
<0.79.0>
3> Pid ! {hello, world}.
{hello,world}
I'm not going to wait all day for a match. Bye.4>
Another example:
-module(my).
%-export([test/0, myand/2]).
-compile(export_all).
-include_lib("eunit/include/eunit.hrl").
sleep(X) ->
receive
after X * 1000 ->
io:format("I just slept for ~w seconds.~n", [X])
end.
In the erlang shell:
1> c(my).
my.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:sleep(5).
I just slept for 5 seconds.
ok
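A timeout of 0 is a related special case: the receive checks the messages already in the mailbox and, if none matches, runs the after body immediately. A small sketch (the module name mbox and function flush/0 are hypothetical) that uses this to empty the caller's mailbox:

```erlang
-module(mbox).
-export([flush/0]).

%% Remove and return every message currently in the caller's mailbox,
%% oldest first. 'after 0' fires as soon as no message is left to match.
flush() ->
    receive
        Msg -> [Msg | flush()]
    after 0 ->
        []
    end.
```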

Can not spawn function on remote node with spawn(Node, Fun) in erlang

I'm experimenting with distributed Erlang. Here's what I have:
loop()->
receive {From, ping} ->
io:format("received ping from ~p~n", [From]),
From ! pong,
loop();
{From, Fun} when is_function(Fun) ->
io:format("executing function ~p received from ~p~n", [Fun, From]),
From ! Fun(),
loop()
end.
test_remote_node_can_execute_sent_clojure()->
Pid = spawn(trecias, fun([])-> loop() end),
Pid ! {self(), fun()-> erlang:nodes() end},
receive Result ->
Result = [node()]
after 300 ->
timeout
end.
This gives: Can not start erlang:apply,[#Fun<tests.1.123107452>,[]] on trecias
The node I run the test on is on the same machine as the node 'trecias'. Both nodes can load the same code.
Any ideas what is amiss?
In the spawn call, you've specified the node name as trecias, but you need to specify the full node name including the hostname, e.g. trecias@localhost.
Also, the function you pass to spawn/2 must take zero arguments, but the one in the code above takes one argument (and crashes if that argument isn't the empty list). Write it as fun() -> loop() end instead.
When spawning an anonymous function on a remote node, you also need to make sure that the module is loaded on both nodes, with the same version. Otherwise you'll get a badfun error.
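Putting those points together, a sketch might look like the following. The module name remote_demo, the function run_on/1, the {result, ...} reply tag and the stop message are assumptions made for this example. The module has to be loaded in the same version on both nodes, and Node must be a full node name atom such as 'trecias@localhost'.

```erlang
-module(remote_demo).
-export([loop/0, run_on/1]).

loop() ->
    receive
        {From, ping} ->
            From ! pong,
            loop();
        {From, Fun} when is_function(Fun, 0) ->
            From ! {result, Fun()},
            loop();
        stop ->
            ok
    end.

%% Node is a full node name, e.g. 'trecias@localhost'.
run_on(Node) ->
    Pid = spawn(Node, fun() -> loop() end),   % the fun takes zero arguments
    Pid ! {self(), fun() -> node() end},
    Result = receive
                 {result, R} -> R
             after 1000 ->
                 timeout
             end,
    Pid ! stop,
    Result.
```

Calling remote_demo:run_on(node()) exercises the same code path on the local node, which is a quick way to check the logic before involving a second node.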

Erlang, try to make gen_server: call with many responses

I am trying to use OTP style in my project and have a question about OTP interfaces. Which solution is more popular or more elegant?
What I have:
A web server built with mochiweb.
One process that spawns many (1000-2000) children.
The children hold state (netflow speed). The process proxies messages to the children and creates new children when needed.
In mochiweb I have one page that shows the speed of all actors. This is how it is built:
nf_collector ! {get_abonents_speed, self()},
receive
{abonents_speed_count, AbonentsCount} ->
ok
end,
%% write http header, chunked
%% and while AbonentsCount != 0, receive speed and write http
As far as I understand, this is not OTP style. Possible solutions:
A synchronous API function collects the speeds from all children and returns them as a list. But I want to write each one to the client as soon as it arrives.
One argument of the API function is a callback:
nf_collector:get_all_speeds(fun (Speed) -> Resp:write_chunk(templater(Speed)) end)
Return an iterator:
One of the results of get_all_speeds is a function containing a receive block. Each call to it returns {ok, Speed}, and at the end it returns {end}.
get_all_speeds() ->
nf_collector ! {get_abonents_speed, self()},
receive
{abonents_speed_count, AbonentsCount} ->
ok
end,
{ok, fun() ->
create_receive_fun(AbonentsCount)
end}.
create_receive_fun(0)->
{end};
create_receive_fun(Count)->
receive
{abonent_speed, Speed} ->
Speed
end,
{ok, Speed, create_receive_fun(Count-1)}.
Spawn your 'children' from a supervisor:
-module(ch_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_child/1]).
start_link() -> supervisor:start_link({local, ?MODULE}, ?MODULE, []).
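% The restart strategy tuple also needs MaxRestarts and MaxSeconds; 5 and 10 are assumed values.
init([]) -> {ok, {{simple_one_for_one, 5, 10}, [{ch, {ch, start_link, []}, transient, 1000, worker, [ch]}]}}.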
start_child(Data) -> supervisor:start_child(?MODULE, [Data]).
Start them with ch_sup:start_child/1 (Data is whatever).
Implement your children as a gen_server:
-module(ch).
-behaviour(gen_server).
-record(?MODULE, {speed}).
...
get_speed(Pid, Timeout) ->
try
gen_server:call(Pid, get, Timeout)
catch
exit:{timeout, _} -> timeout;
exit:{noproc, _} -> died
end
.
...
handle_call(get, _From, St) -> {reply, {ok, St#?MODULE.speed}, St}.
You can now use the supervisor to get the list of running children and query them, though you have to accept the possibility of a child dying between getting the list of children and calling them, and obviously a child could for some reason be alive but not respond, or respond with an error, etc.
The get_speed/2 function above returns either {ok, Speed}, died, or timeout. It remains for you to filter appropriately according to your application's needs, which is easy with a list comprehension. Here are a few.
Just the speeds:
[Speed || {ok, Speed} <- [ch:get_speed(Pid, 1000) || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)
]
]].
Pid and speed tuples:
[{Pid, Speed} || {Pid, {ok, Speed}} <-
[{Pid, ch:get_speed(Pid, 1000)} || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)]
]
].
All results, including timeouts and 'died' results for children that died before you got to them:
[{Pid, Any} || {Pid, Any} <-
[{Pid, ch:get_speed(Pid, 1000)} || Pid <-
[Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)]
]
].
In most situations you almost certainly don't want anything other than the speeds, because what are you going to do about deaths and timeouts? You want those that die to be respawned by the supervisor, so the problem is more or less fixed by the time you know about it, and timeouts, as with any fault, are a separate problem, to be dealt with in whatever way you see fit... There's no need to mix the fault fixing logic with the data retrieval logic though.
Now, the problem with all of these, which I think you were getting at in your post (though I'm not quite sure), is that the timeout of 1000 applies to each call, and the calls are made synchronously, one after the other. So for 1000 children with a 1 second timeout, it could take 1000 seconds to produce no results. Making the timeout 1 ms might be the answer, but doing it properly is a bit more complicated:
get_speeds() ->
ReceiverPid = self(),
Ref = make_ref(),
Pids = [Pid || {undefined, Pid, worker, [ch]} <-
supervisor:which_children(ch_sup)],
lists:foreach(
fun(Pid) -> spawn(
fun() -> ReceiverPid ! {Ref, ch:get_speed(Pid, 1000)} end
) end,
Pids),
receive_speeds(Ref, length(Pids), os_milliseconds(), 1000)
.
receive_speeds(_Ref, 0, _StartTime, _Timeout) ->
[];
receive_speeds(Ref, Remaining, StartTime, Timeout) ->
Time = os_milliseconds(),
% Clamp at 0: a negative value in 'after' would raise an error.
TimeLeft = max(0, Timeout - Time + StartTime),
receive
{Ref, acc_timeout} ->
[];
{Ref, {ok, Speed}} ->
[Speed | receive_speeds(Ref, Remaining-1, StartTime, Timeout)];
{Ref, _} ->
receive_speeds(Ref, Remaining-1, StartTime, Timeout)
after TimeLeft ->
[]
end
.
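os_milliseconds() ->
% os:timestamp/0 returns {MegaSecs, Secs, MicroSecs}; convert the whole value to milliseconds.
{OsMegSecs, OsSecs, OsMicroSecs} = os:timestamp(),
(OsMegSecs * 1000000 + OsSecs) * 1000 + OsMicroSecs div 1000
.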
Here each call is spawned in a different process and the replies collected, until the 'master timeout' or they have all been received.
Code has largely been cut-n-pasted from various works I have lying round, and edited manually and by search replace, to anonymise it and remove surplus, so it's probably mostly compilable quality, but I don't promise I didn't break anything.
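The "master timeout" idea can also be sketched on its own with erlang:monotonic_time/1 instead of os:timestamp/0. This is not the code above: the module name collect and the function gather/2 are hypothetical, the jobs are arbitrary zero-argument funs standing in for the ch:get_speed/2 calls, and the millisecond time unit atom assumes a reasonably recent OTP release.

```erlang
-module(collect).
-export([gather/2]).

%% Run every zero-argument fun in Jobs in its own process and return
%% the results that arrive within Timeout milliseconds overall.
gather(Jobs, Timeout) ->
    Parent = self(),
    Ref = make_ref(),
    lists:foreach(
      fun(Job) -> spawn(fun() -> Parent ! {Ref, Job()} end) end,
      Jobs),
    Deadline = erlang:monotonic_time(millisecond) + Timeout,
    collect(Ref, length(Jobs), Deadline).

collect(_Ref, 0, _Deadline) ->
    [];
collect(Ref, Remaining, Deadline) ->
    %% Time left until the shared deadline, never negative.
    TimeLeft = max(0, Deadline - erlang:monotonic_time(millisecond)),
    receive
        {Ref, Result} ->
            [Result | collect(Ref, Remaining - 1, Deadline)]
    after TimeLeft ->
        []
    end.
```

Because the deadline is computed once, the total wait is bounded by Timeout no matter how many jobs there are. Replies that arrive after the deadline stay in the caller's mailbox tagged with the old Ref, so a long-lived caller may want to flush them.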

Can't spawn with number parameter?

I'm a beginner at Erlang and I've been working through "Learn You Some Erlang For Great Good!". I use a modified version of this example code where the critic has a parameter:
critic(Count) ->
receive
{From, {"Rage Against the Turing Machine", "Unit Testify"}} ->
From ! {self(), {"They are great!", Count}};
{From, {"System of a Downtime", "Memoize"}} ->
From ! {self(), {"They're not Johnny Crash but they're good.", Count}};
{From, {"Johnny Crash", "The Token Ring of Fire"}} ->
From ! {self(), {"Simply incredible.", Count}};
{From, {_Band, _Album}} ->
From ! {self(), {"They are terrible!", Count}}
end,
critic(Count).
Which is spawned like this:
restarter() ->
process_flag(trap_exit, true),
Pid = spawn_link(?MODULE, critic, [my_atom]),
register(critic, Pid),
receive
{'EXIT', Pid, normal} -> % not a crash
ok;
{'EXIT', Pid, shutdown} -> % manual termination, not a crash
ok;
{'EXIT', Pid, _} ->
restarter()
end.
The module is used like this:
1> c(linkmon).
{ok,linkmon}
2> Monitor = linkmon:start_critic().
<0.163.0>
3> linkmon:judge("Rage Against the Turing Machine", "Unit Testify").
{"They are great!",my_atom}
Now, when I change "my_atom" to a simple number (like 255) the monitor crashes:
1> c(linkmon).
{ok,linkmon}
2> Monitor = linkmon:start_critic().
=ERROR REPORT==== 14-Jul-2013::20:42:20 ===
Error in process <0.173.0> with exit value: {badarg,[{erlang,register,[critic,<0.174.0>] []},{linkmon,restarter,0,[{file,"linkmon.erl"},{line,16}]}]}
However, it does work when I send [1] (so the code is "spawn(....., [[255]]).")
Why can't I pass a single number? Just skimming over the documentation of spawn/3 doesn't really tell me anything, except maybe that I missed something and a number is not an Erlang term. But then how do I pass a number?
The error message says that the call to register(critic, Pid) on line 16 crashes due to "badarg" even though the arguments look ok. This can happen if the process referred to by Pid is already dead (if it crashes immediately, e.g. if you pass the wrong number of args), or if you already have a process around using that name. Ensure that the length of the list in the spawn(Mod,Fun,[...]) matches the number of args to your critic() function, and call "whereis(critic)" in the shell to check if there's an old process blocking the name from being reused.
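The first failure mode can be reproduced in isolation. The sketch below is only an illustration: the module name reg_demo, the registered name critic_demo and the stop message are hypothetical. It spawns critic with an empty argument list, so the process dies at once because critic/0 does not exist, waits for it to be gone, and then tries to register the dead pid.

```erlang
-module(reg_demo).
-export([try_register/0, critic/1]).

critic(Count) ->
    receive
        stop -> Count
    end.

%% Spawn critic with the wrong number of arguments, wait until the
%% process is dead, then try to register its pid.
try_register() ->
    {Pid, MRef} = spawn_monitor(?MODULE, critic, []),
    receive
        {'DOWN', MRef, process, Pid, _Reason} -> ok
    end,
    try register(critic_demo, Pid) of
        true -> registered
    catch
        error:badarg -> {error, badarg}
    end.
```

The spawned process also produces an error report for the undefined function, much like the one in the question.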
