How to assign tasks to actors on remote machines in Erlang to achieve distributed computing? - erlang

Here is what I am trying to achieve, I have an application that is mining bitcoins. I wish to improve its functionality by introducing a remote machine which connects to the server and the server assigns some work to it.
On a single machine, I initially spawn all the processes and store their Pids (of the actors), then loop through the Pids and assign the mining job to them. Each actor mines and tries to find a target hash that satisfies the difficulty.
Here is the code for the same:
-module(server).
-export([start/2, spawnMiners/4, mineCoins/3]).

start(LeadingZeroes, InputString) ->
    MinersPIDList = spawnMiners(1000, [], LeadingZeroes, InputString),
    mineCoins(MinersPIDList, LeadingZeroes, InputString).

spawnMiners(0, PIDs, _LeadingZeroes, _InputString) -> PIDs;
spawnMiners(NumberOfMiners, PIDs, LeadingZeroes, InputString) ->
    PID = spawn(miner, findTargetHash, [LeadingZeroes, InputString]),
    spawnMiners(NumberOfMiners - 1, [PID|PIDs], LeadingZeroes, InputString).

mineCoins([], _LeadingZeroes, _InputString) -> ok;
mineCoins([PID|PIDs], LeadingZeroes, InputString) ->
    PID ! {self(), {mine}},
    receive
        {found, _InputWithNonce, _GeneratedHash} ->
            io:format("Found something!~n");
        {not_found} ->
            ok;
        _Other ->
            io:fwrite("In Other!~n")
    end,
    mineCoins(PIDs, LeadingZeroes, InputString).
Here, findTargetHash(LeadingZeroes, InputString) is the task each miner process/actor runs, returning the result if found. So if 1000 processes are spawned, all of them run on the same machine.
This is what I wish to achieve next: Imagine a new machine with an IP address is connected to this server and after connecting, the server divides these 1000 processes between this machine and the remote machine (500 each, for literally performing distributed computing). I do not know where to start and how to achieve this. Any leads will be appreciated. (Maybe something to do with gen_tcp())
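For reference, plain distributed Erlang usually makes gen_tcp unnecessary here. A minimal sketch (my own, not from the post), assuming both machines run named nodes started with the same cookie (e.g. `erl -sname server -setcookie secret` and `erl -sname worker -setcookie secret`), that the nodes have been connected with net_adm:ping/1, and that the miner module is loaded on every node:

```erlang
%% Sketch: round-robin miner processes over all connected nodes with
%% spawn/4. Node names and the miner module are assumptions from the post.
-module(dist_miners).
-export([spawn_miners/3]).

spawn_miners(NumberOfMiners, LeadingZeroes, InputString) ->
    Nodes = [node() | nodes()],   % local node plus every connected node
    spawn_loop(NumberOfMiners, Nodes, LeadingZeroes, InputString, []).

spawn_loop(0, _Nodes, _LeadingZeroes, _InputString, PIDs) ->
    PIDs;
spawn_loop(N, Nodes, LeadingZeroes, InputString, PIDs) ->
    %% Pick the target node for this miner in round-robin order.
    Node = lists:nth((N rem length(Nodes)) + 1, Nodes),
    PID = spawn(Node, miner, findTargetHash, [LeadingZeroes, InputString]),
    spawn_loop(N - 1, Nodes, LeadingZeroes, InputString, [PID | PIDs]).
```

The returned pid list can be fed to mineCoins/3 unchanged: message sends and replies are location-transparent in distributed Erlang, so the rest of the server does not care which machine a miner runs on.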

Related

How do Actor Systems function to prevent memory overflow from queues but also prevent threads blocking on writing on the queues?

Actors send messages to one another. If the queues are limited, then what happens on write/send attempts to full queues? Blocking or dropping? If they are not limited, a memory crash is possible. How much is configurable?
Default mailboxes in Akka are not bounded, so they will not prevent a memory crash. You can, however, configure actors to use different mailboxes; among those there are both mailboxes that discard messages (passing them to dead letters) when the max size is reached, and mailboxes that block (I would not recommend using those). You can find all the mailbox implementations that come with Akka in the docs here: https://doc.akka.io/docs/akka/current/typed/mailboxes.html#mailbox-implementations
You can easily test the behavior of the Erlang VM in this situation. In the shell:
F = fun F() -> receive done -> ok end end,
P = spawn(F),
G = fun G(Pid,Size,Wait) -> Pid ! lists:seq(1,Size), receive done -> ok after Wait -> G(Pid,Size,Wait) end end,
H = fun(Pid,Size,Wait) -> T = fun() -> G(Pid,Size,Wait) end, spawn(T) end,
D = fun D() -> io:format("~p~n~p~n",[erlang:time(),erlang:memory(processes_used)]), receive done -> ok after 10000 -> D() end end,
P1 = spawn(D).
P2 = H(P,100000,5).
You will see that you get a memory allocation exception: the VM writes a crash dump and exits.
I didn't check how to modify the limits; if you run the experiment, you will see that it takes a very large number of messages, using tens of gigabytes of memory in the mailbox.
If you ever reach this situation, I don't think the first reaction should be to increase the limit; you should first look for:
unread messages,
process bottlenecks,
application architecture problems,
whether Erlang is suited to your problem,
...
Actor queues in Erlang have no built-in limit; they are bounded only by the memory available to the VM, and if the VM runs out of memory, the VM crashes. To monitor and manage memory allocation and CPU load you can use os_mon in Erlang.
You can test this in the Erlang shell:
F = fun() -> timer:sleep(60000),
{message_queue_len, InboxLen} = erlang:process_info(self(), message_queue_len),
io:format("Len ===> ~p", [InboxLen])
end.
PID = erlang:spawn(F).
[PID ! "hi" || _ <- lists:seq(1, 50000)].
If you increase the number of messages, you can exhaust the memory.
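Since OTP 19 there is also a per-process safety valve worth knowing about: a maximum heap size, past which the VM can kill just that process instead of the whole node. A shell sketch (my addition, not from the answer; the size is in words, and with the default message_queue_data=on_heap the mailbox contents count toward the heap):

```erlang
%% Sketch: spawn a receiver whose heap is capped; flooding it past the
%% limit terminates the process (with an error report) rather than
%% crashing the VM.
P = spawn_opt(fun() -> receive done -> ok end end,
              [{max_heap_size, #{size => 1000000,
                                 kill => true,
                                 error_logger => true}}]).
```

This is a last-resort guard, not a bounded queue: senders are never blocked, the overloaded process is simply killed.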
Default mailboxes in Akka are not bounded. But if you want to limit the maximum number of messages in a mailbox, you can build an Akka Stream in the actor; then an OverflowStrategy can be applied on demand.
For example:
val source: Source[Message, SourceQueueWithComplete[Message]] =
  Source.queue[Message](bufferSize = 8192,
                        overflowStrategy = OverflowStrategy.dropNew)

Erlang producers and consumers - strange behaviour of program

I am writing a program that solves the producers-consumers problem using Erlang multiprocessing, with one process responsible for handling the buffer to/from which I produce/consume, and many producer and consumer processes. To simplify, I assume a producer/consumer does not know that its operation has failed (that it is impossible to produce or consume because of buffer constraints), but the server is prepared to handle this.
My code is:
Server code
server(Buffer, Capacity, CountPid) ->
    receive
        %% PRODUCER
        {Pid, produce, InputList} ->
            NumberProduce = lists:flatlength(InputList),
            case canProduce(Buffer, NumberProduce, Capacity) of
                true ->
                    NewBuffer = append(InputList, Buffer),
                    CountPid ! lists:flatlength(InputList),
                    Pid ! ok,
                    server(NewBuffer, Capacity, CountPid);
                false ->
                    Pid ! tryagain,
                    server(Buffer, Capacity, CountPid)
            end;
        %% CONSUMER
        {Pid, consume, Number} ->
            case canConsume(Buffer, Number) of
                true ->
                    Data = lists:sublist(Buffer, Number),
                    NewBuffer = lists:subtract(Buffer, Data),
                    Pid ! {ok, Data},
                    server(NewBuffer, Capacity, CountPid);
                false ->
                    Pid ! tryagain,
                    server(Buffer, Capacity, CountPid)
            end
    end.
Producer and consumer
producer(ServerPid) ->
    X = rand:uniform(9),
    ToProduce = [rand:uniform(500) || _ <- lists:seq(1, X)],
    ServerPid ! {self(), produce, ToProduce},
    producer(ServerPid).

consumer(ServerPid) ->
    X = rand:uniform(9),
    ServerPid ! {self(), consume, X},
    consumer(ServerPid).
Starting and auxiliary functions (included because I don't know exactly where my problem is)
spawnProducers(Number, ServerPid) ->
    case Number of
        0 -> io:format("Spawned producers");
        N ->
            spawn(zad2, producer, [ServerPid]),
            spawnProducers(N - 1, ServerPid)
    end.

spawnConsumers(Number, ServerPid) ->
    case Number of
        0 -> io:format("Spawned producers");
        N ->
            spawn(zad2, consumer, [ServerPid]),
            spawnProducers(N - 1, ServerPid)
    end.

start(ProdsNumber, ConsNumber) ->
    CountPid = spawn(zad2, count, [0, 0]),
    ServerPid = spawn(zad2, server, [[], 20, CountPid]),
    spawnProducers(ProdsNumber, ServerPid),
    spawnConsumers(ConsNumber, ServerPid).

canProduce(Buffer, Number, Capacity) ->
    lists:flatlength(Buffer) + Number =< Capacity.

canConsume(Buffer, Number) ->
    lists:flatlength(Buffer) >= Number.

append([H|T], Tail) ->
    [H | append(T, Tail)];
append([], Tail) ->
    Tail.
I am trying to count the number of produced elements using the following process; the server sends it a message whenever elements are produced.
count(N, ThousandsCounter) ->
    receive
        X ->
            if
                N >= 1000 ->
                    io:format("Yeah! We have produced ~p elements!~n", [ThousandsCounter]),
                    count(0, ThousandsCounter + 1000);
                true -> count(N + X, ThousandsCounter)
            end
    end.
I expect this program to work properly, meaning: it produces elements, production grows roughly linearly with time (f(t) = kt for some constant k), and the more processes I have, the faster production is.
ACTUAL QUESTION
I launch program:
erl
c(zad2)
zad2:start(5,5)
How the program behaves:
The longer production lasts, the fewer elements per unit of time are produced (e.g. 10000 in the first second, 5000 in the next, 1000 in the 10th second, etc.)
The more processes I have, the slower production is: with start(10,10) I have to wait about a second for the first thousand, whereas with start(2,2) 20000 appears almost immediately
start(100,100) made me restart my computer (I work on Ubuntu), as the whole CPU was used and there was no memory left for me to open a terminal and kill the Erlang VM
Why does my program not behave as I expect? Am I doing something wrong in my Erlang programming, or is this a matter of the OS or something else?
The producer/1 and consumer/1 functions as written above don't ever wait for anything - they just loop and loop, bombarding the server with messages. The server's message queue is filling up very quickly, and the Erlang VM will try to grow as much as it can, stealing all your memory, and the looping processes will steal all available CPU time on all cores.
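One way to add that missing waiting is backpressure: since server/3 already replies ok or tryagain to every request, the producer can simply block on that reply before producing again. A minimal sketch (my own, not from the original post); the consumer can be changed the same way:

```erlang
%% Sketch: a producer that waits for the server's reply before looping,
%% so the server's mailbox can never grow without bound.
producer(ServerPid) ->
    X = rand:uniform(9),
    ToProduce = [rand:uniform(500) || _ <- lists:seq(1, X)],
    ServerPid ! {self(), produce, ToProduce},
    receive
        ok       -> ok;              % request accepted
        tryagain -> timer:sleep(10)  % buffer full; back off briefly
    end,
    producer(ServerPid).
```

With this change, each producer has at most one outstanding request, so throughput is bounded by the server instead of by how fast the loops can spin.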

Creating cursor with erlang rpc with mnesia activity

Hi, I'm trying to create a cursor in Mnesia from a remote node. For example:
I have the node that owns the Mnesia database and runs important processes on a dedicated server machine, and another node, running a process on a different computer, that has to walk through all the items and perform some simple operations on the data. The thing is that I can make this auxiliary process work if it runs on the same node as Mnesia, but not remotely.
This is my code that runs locally:
make_cursor_local() ->
    QD = qlc:sort(mnesia:table(customer, [{traverse, select}])),
    mnesia:activity(async_dirty, fun() -> qlc:cursor(QD) end, mnesia_frag).

get_next_local(Cursor) ->
    Get = fun() -> qlc:next_answers(Cursor, 100) end,
    mnesia:activity(async_dirty, Get, mnesia_frag).

del_cursor_local(Cursor) ->
    qlc:delete_cursor(Cursor).
This is my current code using rpc:
make_cursor() ->
    Sort = rpc:call(?NamespaceNode, mnesia, table, [customer, [{traverse, select}]]),
    QD = rpc:call(?NamespaceNode, qlc, sort, [Sort]),
    CursorCreation = fun() -> qlc:cursor(qlc:sort(Sort)) end,
    Cursor = rpc:call(?NamespaceNode, mnesia, activity, [async_dirty, CursorCreation, mnesia_frag]),
    Cursor.

get_next(Cursor) ->
    Get = fun() -> rpc:call(?NamespaceNode, qlc, next_answers, [Cursor, 100]) end,
    Next = rpc:call(?NamespaceNode, mnesia, activity, [async_dirty, Get, mnesia_frag]),
    Next.

del_cursor(Cursor) ->
    rpc:call(?NamespaceNode, qlc, delete_cursor, [Cursor]).
This code generates the following error at the mnesia:activity call in make_cursor:
{badrpc,
    {'EXIT',
        {undef,
            [{#Fun&lt;cleaner_app.2.116369932&gt;,[],[]},
             {mnesia_tm,non_transaction,5,
                 [{file,"mnesia_tm.erl"},{line,738}]},
             {rpc,'-handle_call_call/6-fun-0-',5,
                 [{file,"rpc.erl"},{line,205}]}]}}}
I found that a remote node can't execute anonymous functions created on another node, so the error above can be solved this way:
make_cursor() ->
    Sort = rpc:call(?NamespaceNode, mnesia, table, [customer, [{traverse, select}]]),
    QD = rpc:call(?NamespaceNode, qlc, sort, [Sort]),
    rpc:call(?NamespaceNode, mnesia, activity, [sync_transaction, fun qlc:cursor/1, [QD], mnesia_frag]).
I found the answer here: what kind of types can be sent on an erlang message?
Now I have to solve the cursor ownership problem, because I can't use the cursor from the remote node.
This is the error:
{badrpc,
{'EXIT',
{aborted,
{not_cursor_owner,
[{qlc,next_answers,
[{qlc_cursor,{<6920.991.0>,<6920.990.0>}},100],
[{file,"qlc.erl"},{line,515}]},
{mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,833}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,813}]},
{mnesia,wrap_trans,6,[{file,"mnesia.erl"},{line,394}]},
{rpc,'-handle_call_call/6-fun-0-',5,
[{file,"rpc.erl"},{line,205}]}]}}}}
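A qlc cursor can only be used by the process that created it, so one way around not_cursor_owner (a sketch of my own, not from the original post; cursor_helper and fold_customers are hypothetical names) is to keep the whole cursor lifecycle inside one function on the owning node and make a single rpc call:

```erlang
%% Sketch: run the complete cursor loop on the Mnesia-owning node and
%% return the collected answers in one rpc round-trip. The remote caller
%% never touches the cursor, so ownership is never an issue.
-module(cursor_helper).
-export([fold_customers/0]).

fold_customers() ->
    QD = qlc:sort(mnesia:table(customer, [{traverse, select}])),
    Loop = fun Next(Cursor, Acc) ->
                   case qlc:next_answers(Cursor, 100) of
                       [] ->
                           qlc:delete_cursor(Cursor),
                           lists:append(lists:reverse(Acc));
                       Answers ->
                           Next(Cursor, [Answers | Acc])
                   end
           end,
    mnesia:activity(async_dirty,
                    fun() -> Loop(qlc:cursor(QD), []) end,
                    mnesia_frag).
```

From the remote node: rpc:call(?NamespaceNode, cursor_helper, fold_customers, []). If the table is too large to return in one reply, the helper can instead apply a processing fun to each batch on the owning node.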

Extract values from list of tuples continuously in Erlang

I am learning Erlang from a Ruby background and having some difficulty grasping the thought process. The problem I am trying to solve is the following:
I need to make the same request to an API; each time I receive a unique ID in the response, which I need to pass into the next request, until no ID is returned. From each response I need to extract certain data and use it for other things as well.
First get the iterator:
ShardIteratorResponse = kinetic:get_shard_iterator(GetShardIteratorPayload).
{ok,[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89D"...>>}]}
Parse out the shard_iterator..
{_, [{_, ShardIterator}]} = ShardIteratorResponse.
Make the request to kinesis for the streams records...
GetRecordsPayload = [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89DETABlwVV"...>>}]
RecordsResponse = kinetic:get_records(GetRecordsPayload).
{ok,[{<<"NextShardIterator">>,
<<"AAAAAAAAAAFy3dnTJYkWr3gq0CGo3hkj1t47ccUS10f5nADQXWkBZaJvVgTMcY+nZ9p4AZCdUYVmr3dmygWjcMdugHLQEg6x"...>>},
{<<"Records">>,
[{[{<<"Data">>,<<"Zmlyc3QgcmVjb3JkISEh">>},
{<<"PartitionKey">>,<<"BlanePartitionKey">>},
{<<"SequenceNumber">>,
<<"49545722516689138064543799042897648239478878787235479554">>}]}]}]}
What I am struggling with is how to write a loop that keeps hitting the Kinesis endpoint for that stream until there are no more shard iterators, i.e. until I have all the records, since I can't re-assign variables as I would in Ruby.
WARNING: my code might be buggy, but it's "close". I've never run it and don't know what the last iterator looks like.
I see you are trying to do your job entirely in the shell. It's possible but hard. You can use named funs and recursion (easier since release 17.0), for example:
F = fun (ShardIteratorPayload) ->
{_, [{_, ShardIterator}]} = kinetic:get_shard_iterator(ShardIteratorPayload),
FunLoop =
fun Loop(<<>>, Accumulator) -> % no clue how last iterator can look like
lists:reverse(Accumulator);
Loop(ShardIterator, Accumulator) ->
{ok, [{_, NextShardIterator}, {<<"Records">>, Records}]} =
kinetic:get_records([{<<"ShardIterator">>, <<ShardIterator/binary>>}]),
Loop(NextShardIterator, [Records | Accumulator])
end,
FunLoop(ShardIterator, [])
end.
AllRecords = F(GetShardIteratorPayload).
But it's too complicated to type in shell...
It's much easier to code it in modules.
A common pattern in Erlang is to spawn another process or processes to fetch your data. To keep it simple, spawn another process by calling spawn or spawn_link, but don't bother with links for now; just use spawn/3.
Let's compile a simple consumer module:
-module(kinetic_simple_consumer).
-export([start/1]).
start(GetShardIteratorPayload) ->
    Pid = spawn(kinetic_simple_fetcher, start, [self(), GetShardIteratorPayload]),
    consumer_loop(Pid).

consumer_loop(FetcherPid) ->
    receive
        {FetcherPid, finished} ->
            ok;
        {FetcherPid, {records, Records}} ->
            consume(Records),
            consumer_loop(FetcherPid);
        UnexpectedMsg ->
            io:format("DROPPING:~n~p~n", [UnexpectedMsg]),
            consumer_loop(FetcherPid)
    end.

consume(Records) ->
    io:format("RECEIVED:~n~p~n", [Records]).
And fetcher:
-module(kinetic_simple_fetcher).
-export([start/2]).
start(ConsumerPid, GetShardIteratorPayload) ->
    {ok, [ShardIterator]} = kinetic:get_shard_iterator(GetShardIteratorPayload),
    fetcher_loop(ConsumerPid, ShardIterator).

fetcher_loop(ConsumerPid, {_, <<>>}) -> % no clue what the last iterator looks like
    ConsumerPid ! {self(), finished};
fetcher_loop(ConsumerPid, ShardIterator) ->
    {ok, [NextShardIterator, {<<"Records">>, Records}]} =
        kinetic:get_records(shard_iterator(ShardIterator)),
    ConsumerPid ! {self(), {records, Records}},
    fetcher_loop(ConsumerPid, NextShardIterator).

shard_iterator({_, ShardIterator}) ->
    [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
As you can see both processes can do their job concurrently.
Try from your shell:
kinetic_simple_consumer:start(GetShardIteratorPayload).
Now you will see that your shell process becomes the consumer, and you will get your shell back once the fetcher sends {ItsPid, finished}.
Next time instead of
kinetic_simple_consumer:start(GetShardIteratorPayload).
run:
spawn(kinetic_simple_consumer, start, [GetShardIteratorPayload]).
You should play with spawning processes; it's Erlang's main strength.
In Erlang, you write loops as tail-recursive functions. I don't know the kinetic API, so for simplicity I'll just assume that kinetic:next_iterator/1 returns {ok, NextIterator}, or {error, Reason} when there are no more shards.
loop({error, _Reason}) ->
    ok;
loop({ok, Iterator}) ->
    do_something_with(Iterator),
    Result = kinetic:next_iterator(Iterator),
    loop(Result).
You are replacing iteration with recursion. The first clause deals with the case where there are no more shards left (always start a recursion with the end condition). The second clause deals with the case where we got an iterator: we do something with it and fetch the next one.
The recursive call is the last instruction in the function body, which is called a tail call. Erlang optimizes such calls: they don't grow the call stack, so they can run indefinitely in constant memory (you will not get anything like "stack level too deep").

Exceptions when running Erlang program on different terminal windows

I am developing a simple framework in Erlang to handle 2-player turn-based games. The code is the following:
-module(game).
-export([start_server/0,generate_server/0,add_player/0,remove_player/0]).
generate_server() ->
    Table_num = 0,
    Player_num = 0,
    io:format("Server generated...~n", []),
    io:format("The current number of tables is ~w~n", [Table_num]),
    io:format("The current number of players is ~w~n", [Player_num]),
    receive
        login ->
            io:format("A new player has connected!~n", []),
            New = Player_num + 1,
            io:format("The current number of players is ~w~n", [New]);
        logout ->
            io:format("You have been successfully disconnected~n", [])
    end.

start_server() ->
    io:format("Welcome player!~nInitializing game...~n", []),
    io:format("Generating server...~n", []),
    register(server, spawn(game, generate_server, [])).

add_player() ->
    server ! login.

remove_player() ->
    server ! logout.
There are two main problems when I run this code:
When I execute add_player() and then remove_player(), this second function crashes with an exception
If I launch the program in one terminal window and then execute add_player() in a second terminal window, I get an error. What should I do to be able to run it in more than one terminal window?
Any help would be highly appreciated.
1/
There is no loop in your server. When you start it, after some printing, it waits in the receive statement.
When it receives the login message, it executes the operations and then the server code is finished; the process dies and is unregistered, all its variables disappear, and the VM cleans up the memory.
So, later on, any process sending a message to the server will crash, because you are using a name that is no longer registered.
To make it work, you should keep track of the connected players (in a list, for example) and re-enter the server loop with this list as a parameter.
generate_server(Tlist, Plist) ->
    io:format("The current number of tables is ~w~n", [length(Tlist)]),
    io:format("The current number of players is ~w~n", [length(Plist)]),
    receive
        {login, Name} ->
            io:format("A new player ~p has connected!~n", [Name]),
            generate_server(Tlist, [Name | Plist]);
        {logout, Name} ->
            io:format("You have been successfully disconnected~n", []),
            generate_server(Tlist, lists:delete(Name, Plist))
    end.
and the call to generate_server is done by
register(server,spawn(game, generate_server, [[],[]]))
2/
In order to send Erlang messages between 2 different nodes, you need to:
share the same Erlang cookie
discover the nodes (using net_adm for example)
get the server pid, or use a globally registered name
see example at http://learnyousomeerlang.com/distribunomicon#alone-in-the-dark
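Concretely, a shell sketch of the two-terminal case (my own; the node name 'game@host' is hypothetical, use whatever name your first shell prints):

```erlang
%% Terminal 1:  erl -sname game -setcookie secret
%% game@host> game:start_server().

%% Terminal 2:  erl -sname player -setcookie secret
net_adm:ping('game@host').        % returns pong once the nodes are connected
{server, 'game@host'} ! login.    % send to the name registered on the other node
```

The {Name, Node} send form reaches a locally registered process on another node, so add_player/0 and remove_player/0 only need the server's node name to work across terminals.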
You left out the loop for the server. Your program crashes because the server receives only one message and exits. Consider another version of the server below:
generate_server() ->
    Table_num = 0,
    Player_num = 0,
    io:format("Server generated...~n", []),
    io:format("The current number of tables is ~w~n", [Table_num]),
    io:format("The current number of players is ~w~n", [Player_num]),
    loop([]).

loop(Players) ->
    receive
        {From, {login, PlayerId}} ->
            io:format("A new player has connected!~n", []),
            NewPlayers = case lists:member(PlayerId, Players) of
                true ->
                    From ! {login_failed, exists},
                    Players;
                false ->
                    From ! {login_success, true},
                    [PlayerId | Players]
            end,
            io:format("The current number of players is ~w~n", [length(NewPlayers)]),
            loop(NewPlayers);
        {From, {logout, PlayerId}} ->
            NewPlayers = case lists:member(PlayerId, Players) of
                true ->
                    From ! {logout, ok},
                    Players -- [PlayerId];
                false ->
                    From ! {logout, failed},
                    Players
            end,
            loop(NewPlayers);
        _ -> loop(Players)
    end.
There; that looks much better.
