Erlang Hot Code loading fails to keep state - erlang

I'm a beginner at Erlang programming. To understand hot code loading better, I used the example from Wikipedia (I added replies to the sending Pid for debugging):
%% A process whose only job is to keep a counter.
%% First version
-module(counter).
-export([start/0, codeswitch/2]).
start() -> loop(0).
loop(Sum) ->
    receive
        {increment, Count} ->
            loop(Sum+Count);
        %% modified code, which will be loaded:
        % reset ->
        %     loop(0);
        {counter, Pid} ->
            Pid ! {counter, Sum},
            loop(Sum);
        {code_switch, Pid} ->
            Pid ! {switch, Sum},
            ?MODULE:codeswitch(Pid, Sum)
            % Force the use of 'codeswitch/2' from the latest MODULE version
    end.
codeswitch(FromPid, Sum) ->
    FromPid ! {switched, Sum},
    loop(Sum).
All is good. I can load the module via c(counter). in the shell, spawn a new process via Pid = spawn(fun counter:start/0). and send messages to the spawned process. When I now add a new pattern reset -> loop(0) to the receive expression and reload the code via c(counter)., everything works as expected: the new code is loaded, Sum keeps its incremented value, etc.
But when I send the {code_switch, self()} message, Sum gets reset to 0 when loop(Sum) is called (FromPid ! {switched, Sum} in the call to codeswitch/2 still reports the correct state).
What am I missing? Why does my state go away after the first call to a code-switched function?
Thanks for your help!
18> Pid ! {counter, self()}.
{counter,<0.49.0>}
19> flush().
Shell got {counter,6}
ok
20> Pid ! {code_switch, self()}.
{code_switch,<0.49.0>}
21> flush().
Shell got {switch,6}
Shell got {switched,6}
ok
22> Pid ! {counter, self()}.
{counter,<0.49.0>}
23> flush().
Shell got {counter,0}
ok
I put io:format("DebugInfo:~p~n", [Sum]) as the first expression in loop. Result is:
12> Pid ! {code_switch, self()}.
DebugInfo:3
{code_switch,<0.33.0>}
DebugInfo:0
13> flush().
Shell got {switch,3}
Shell got {switched,3}
ok
EDIT: I found that when I spawn the process via spawn/3, i.e. spawn(counter, start, [])., this works. When I spawn the process via spawn/1, i.e. spawn(fun counter:start/0)., it doesn't work. Is this expected behavior? What am I missing?
Documentation states for spawn/1:
Returns the process identifier of a new process started by the application of Fun to the empty list []. Otherwise works like spawn/3.
EDIT: .... Aaaand after trying to replicate this on an Ubuntu virtual machine (where it didn't happen), I am now also unable to reproduce this (and will test my memory for corruption now..)

This is not the behavior that I am seeing when testing your program:
25> LPid ! {counter, self()}.
{counter,<0.39.0>}
26> flush().
Shell got {counter,6}
ok
27> c(counter).
{ok,counter}
28> LPid ! {counter, self()}.
{counter,<0.39.0>}
29> flush().
Shell got {counter,6}
ok
30> LPid ! {increment, 2}.
{increment,2}
31> LPid ! {counter, self()}.
{counter,<0.39.0>}
32> flush().
Shell got {counter,8}
ok
33> LPid ! {code_switch, self()}.
{code_switch,<0.39.0>}
34> flush().
Shell got {switch,8}
Shell got {switched,8}
ok
35> LPid ! {counter, self()}.
{counter,<0.39.0>}
36> flush().
Shell got {counter,8}
ok
Can you maybe add some logs like io:format("DebugInfo:~p~n", [Sum]). to some of the functions to see what's going on?
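Besides logging, the runtime can report whether a process is still executing an old version of a module, which may help narrow this down. Here is a minimal sketch, assuming a helper module named code_check (a hypothetical name, not part of the question's code) that wraps erlang:check_process_code/2:

```erlang
%% code_check.erl (hypothetical helper module for debugging)
-module(code_check).
-export([running_old_code/2]).

%% Returns true if Pid is still executing an old version of Module,
%% false otherwise. Pid must be a process on the local node.
running_old_code(Pid, Module) when is_pid(Pid), is_atom(Module) ->
    erlang:check_process_code(Pid, Module).
```

Calling code_check:running_old_code(LPid, counter). right after c(counter). should return true, because the process is still looping in the previous version, and false once it has handled {code_switch, self()} and re-entered loop/1 through the fully qualified call.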


Erlang beginnings: moving a function from an escript into OTP

There is a simple implementation of the factorial function in an 'escript' in the Erlang docs. The factorial function is given as:
fac(0) -> 1;
fac(N) -> N * fac(N-1).
That's all fine, I can get this to work, no problem.
I would however like to know how I can implement this same, simple factorial function in an 'OTP way' using rebar3?
Just to be clear, my questions are:
Where does the code go?
How would I call it from the shell?
Could I also run it from the command line like I do via the escript example?
FYI, I have gotten started with rebar3. Here is where I am at:
rebar3 new app factorial
creates a few files but specifically the code is in 3 files in a src directory. I can see that a supervisor is being used, seems fine.
I can interact with this project from the shell:
$ rebar3 shell
1> application:which_applications().
[{factorial,"An OTP application","0.1.0"},
{inets,"INETS CXC 138 49","7.0.3"},
{ssl,"Erlang/OTP SSL application","9.1.1"},
{public_key,"Public key infrastructure","1.6.4"},
{asn1,"The Erlang ASN1 compiler version 5.0.8","5.0.8"},
{crypto,"CRYPTO","4.4"},
{stdlib,"ERTS CXC 138 10","3.7"},
{kernel,"ERTS CXC 138 10","6.2"}]
2> application:stop(factorial).
=INFO REPORT==== 21-Jan-2019::12:42:07.484244 ===
application: factorial
exited: stopped
type: temporary
ok
3> application:start(factorial).
ok
Where does the code go?
To 'call code in the OTP way', you can put it behind a gen_server.
For this simple factorial function, I added a new file factorial.erl within the src directory. It is pretty much a standard gen_server skeleton, with my factorial function called from the handle_call callback:
% factorial.erl
-module(factorial).
-behaviour(gen_server).
-export([start_link/0, stop/0, calc/1]).
<boilerplate gen_server stuff here, like init, etc.>
calc(N) ->
    {ok, Result} = gen_server:call(?SERVER, {calc, N}),
    {ok, Result}.
handle_call({calc, N}, _From, State) ->
    Factorial = factorial(N),
    Reply = {ok, Factorial},
    {reply, Reply, State}.
factorial(0) ->
    1;
factorial(N) ->
    N * factorial(N-1).
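For readers who want something they can compile as-is, here is a minimal, self-contained sketch of the same idea. The module name factorial_sketch, the ?SERVER macro definition, and the empty map used as state are assumptions for illustration; they are not the contents of the elided boilerplate above.

```erlang
%% factorial_sketch.erl (hypothetical module name)
-module(factorial_sketch).
-behaviour(gen_server).

%% Public API
-export([start_link/0, stop/0, calc/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

-define(SERVER, ?MODULE).

%% Start the server and register it locally under ?SERVER.
start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

stop() ->
    gen_server:stop(?SERVER).

%% Synchronous request; returns {ok, Factorial}.
calc(N) when is_integer(N), N >= 0 ->
    gen_server:call(?SERVER, {calc, N}).

%% The server keeps no meaningful state; an empty map is a placeholder.
init([]) ->
    {ok, #{}}.

handle_call({calc, N}, _From, State) ->
    {reply, {ok, factorial(N)}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

factorial(0) -> 1;
factorial(N) when N > 0 -> N * factorial(N - 1).
```

The guard on calc/1 rejects negative or non-integer input in the caller, so a bad argument does not crash the server process.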
Since my rebar3 new app factorial created a supervisor, I modified the supervisor's init so that it calls my factorial module:
% factorial_sup.erl
<skeleton supervisor stuff here>
init([]) ->
    Server = {factorial, {factorial, start_link, []},
              permanent, 2000, worker, [factorial]},
    Children = [Server],
    RestartStrategy = {one_for_one, 0, 1},
    {ok, {RestartStrategy, Children}}.
How do I call it from the shell?
$ rebar3 shell
<Enter>
1> factorial:calc(5).
{ok,120}
Since this is running under a supervisor, we can still stop and restart it:
2> application:stop(factorial).
=INFO REPORT==== 22-Jan-2019::13:31:29.243520 ===
application: factorial
exited: stopped
type: temporary
ok
3> factorial:calc(5).
** exception exit: {noproc,{gen_server,call,[factorial,{calc,5}]}}
in function gen_server:call/2 (gen_server.erl, line 215)
in call from factorial:calc/1 (/Users/robert/git/factorial/src/factorial.erl, line 32)
4> application:start(factorial).
ok
5> factorial:calc(5).
{ok,120}
How do I create an executable?
Work in progress :-).

Erlang with multiple client shells - spawned process hangs in 'io' module

Working through Joe's book, I got stuck on Chapter 12, exercise 1. That exercise asks you to write a function start(AnAtom, Fun) that registers AnAtom as spawn(Fun). I decided to try something seemingly easier: I took the chapter's finished 'area_server' module and modified its start/0 function like this:
start() ->
    Pid = spawn(ex1, loop, []),
    io:format("Spawned ~p~n",[Pid]),
    register(area, Pid).
So in place of a process executing an arbitrary Fun, I am registering 'loop', which is the function in the area_server module that does all the work:
loop() ->
    receive
        {From, {rectangle, Width, Ht}} ->
            io:format("Computing for rectangle...~n"),
            From ! {self(), Width*Ht},
            loop();
        {From, {square, Side}} ->
            io:format("Computing for square...~n"),
            From ! {self(), Side*Side},
            loop();
        {From, Other} ->
            io:format("lolwut?~n"),
            From ! {self(), {error, Other}},
            loop()
    end.
It seems to be working just fine:
1> c("ex1.erl").
{ok,ex1}
2> ex1:start().
Spawned <0.68.0>
true
3>
3> area ! {self(), hi}.
lolwut?
{<0.61.0>,hi}
4> flush().
Shell got {<0.68.0>,{error,hi}}
ok
5> area ! {self(), {square, 7}}.
Computing for square...
{<0.61.0>,{square,7}}
6> flush().
Shell got {<0.68.0>,49}
ok
Things went bad when I tried to test that multiple processes can talk to the registered "server". (CTRL-G, s, c 2)
I'm in a new shell, running alongside the first, but the moment I send a message from this new shell to my 'area' registered process, something nasty happens. When querying process_info(whereis(area)), the process moves from this state:
{current_function,{ex1,loop,0}},
{initial_call,{ex1,loop,0}},
to this one:
{current_function,{io,execute_request,2}},
{initial_call,{ex1,loop,0}},
while the message queue starts to grow, messages not getting processed. Hanging in module io, huh! Something is blocked on the io operations? Apparently the process is moved from my ex1:loop/0 into io:execute_request/2 (whatever that is)... are my silly prints causing the problem?
Your processes are doing what you expect, with the exception of handling who has control over STDOUT at what moment. And yes, this can cause weird-seeming behavior in the shell.
So let's try something like this without any IO commands that are implied to go to STDOUT and see what happens. Below is a shell session where I define a loop that accumulates messages until I ask it to send me the messages it has accumulated. We can see from this example (which does not get hung up on who is allowed to talk to the single output resource) that the processes behave as expected.
One thing to take note of is that you do not need multiple shells to talk to or from multiple processes.
Note the return value of flush/0 in the shell -- it is a special shell command that dumps the shell's mailbox to STDOUT.
Eshell V9.0 (abort with ^G)
1> Loop =
1> fun L(History) ->
1> receive
1> halt ->
1> exit(normal);
1> {Sender, history} ->
1> Sender ! History,
1> L([]);
1> Message ->
1> NewHistory = [Message | History],
1> L(NewHistory)
1> end
1> end.
#Fun<erl_eval.30.87737649>
2> {Pid1, Ref1} = spawn_monitor(fun() -> Loop([]) end).
{<0.64.0>,#Ref<0.1663562856.2369257474.102541>}
3> {Pid2, Ref2} = spawn_monitor(fun() -> Loop([]) end).
{<0.66.0>,#Ref<0.1663562856.2369257474.102546>}
4> Pid1 ! "blah".
"blah"
5> Pid1 ! "blee".
"blee"
6> Pid1 ! {self(), history}.
{<0.61.0>,history}
7> flush().
Shell got ["blee","blah"]
ok
8> Pid1 ! "Message from shell 1".
"Message from shell 1"
9> Pid2 ! "Message from shell 1".
"Message from shell 1"
10>
User switch command
--> s
--> j
1 {shell,start,[init]}
2* {shell,start,[]}
--> c 2
Eshell V9.0 (abort with ^G)
1> Shell1_Pid1 = pid(0,64,0).
<0.64.0>
2> Shell1_Pid2 = pid(0,66,0).
<0.66.0>
3> Shell1_Pid1 ! "Message from shell 2".
"Message from shell 2"
4> Shell1_Pid2 ! "Another message from shell 2".
"Another message from shell 2"
5> Shell1_Pid1 ! {self(), history}.
{<0.77.0>,history}
6> flush().
Shell got ["Message from shell 2","Message from shell 1"]
ok
7>
User switch command
--> c 1
11> Pid2 ! {self(), history}.
{<0.61.0>,history}
12> flush().
Shell got ["Another message from shell 2","Message from shell 1"]
ok
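The same idea can be applied to the area server from the question. The following is a minimal sketch, assuming a hypothetical module named area_sketch: the server loop performs no console I/O at all and reports everything through replies, so it has no reason to wait on whichever shell currently owns the output. The registered name, the reply tag and the one-second timeout are illustrative choices, not taken from the book.

```erlang
%% area_sketch.erl (hypothetical module name)
-module(area_sketch).
-export([start/0, area/1, loop/0]).

%% Spawn the server and register it under the module name.
start() ->
    register(area_sketch, spawn(?MODULE, loop, [])).

%% Client side: send a request and wait for the tagged reply.
%% Raises badarg if the server has not been started.
area(Shape) ->
    area_sketch ! {self(), Shape},
    receive
        {area_sketch_reply, Result} -> Result
    after 1000 ->
        timeout
    end.

%% Server side: no io:format calls, only replies to the sender.
loop() ->
    receive
        {From, {rectangle, Width, Ht}} ->
            From ! {area_sketch_reply, Width * Ht},
            loop();
        {From, {square, Side}} ->
            From ! {area_sketch_reply, Side * Side},
            loop();
        {From, Other} ->
            From ! {area_sketch_reply, {error, Other}},
            loop()
    end.
```

Any printing then happens in the calling shell, for example io:format("~p~n", [area_sketch:area({square, 7})])., so each shell only writes to its own output.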

Can't start process in erlang node

I have two Erlang nodes: node01 is 'vm01@192.168.146.128' and node02 is 'vm02@192.168.146.128'. I want to start a process on node01 by calling spawn(Node, Mod, Fun, Args) from node02, but I always get a useless pid.
Node connection is ok:
(vm02@192.168.146.128)14> net_adm:ping('vm01@192.168.146.128').
pong
Module is in the path of node01 and node02:
(vm01@192.168.146.128)7> m(remote_process).
Module: remote_process
MD5: 99784aa56b4feb2f5feed49314940e50
Compiled: No compile time info available
Object file: /src/remote_process.beam
Compiler options: []
Exports:
init/1
module_info/0
module_info/1
start/0
ok
(vm02@192.168.146.128)20> m(remote_process).
Module: remote_process
MD5: 99784aa56b4feb2f5feed49314940e50
Compiled: No compile time info available
Object file: /src/remote_process.beam
Compiler options: []
Exports:
init/1
module_info/0
module_info/1
start/0
ok
However, the spawn is not successful:
(vm02@192.168.146.128)21> spawn('vm01@192.168.146.128', remote_process, start, []).
I'm on node 'vm01@192.168.146.128'
<9981.89.0>
My pid is <9981.90.0>
(vm01@192.168.146.128)8> whereis(remote_process).
undefined
The process is able to run on the local node:
(vm02@192.168.146.128)18> remote_process:start().
I'm on node 'vm02@192.168.146.128'
My pid is <0.108.0>
{ok,<0.108.0>}
(vm02@192.168.146.128)24> whereis(remote_process).
<0.115.0>
But it fails on the remote node. Can anyone give me some ideas?
Here is the source code of remote_process.erl:
-module(remote_process).
-behaviour(supervisor).
-export([start/0, init/1]).
start() ->
    {ok, Pid} = supervisor:start_link({global, ?MODULE}, ?MODULE, []),
    {ok, Pid}.
init([]) ->
    io:format("I'm on node ~p~n", [node()]),
    io:format("My pid is ~p~n", [self()]),
    {ok, {{one_for_one, 1, 5}, []}}.
You are using a global registration for your process, which is what you need for your purpose, but whereis/1 only looks up locally registered names. The function to retrieve a globally registered name is global:whereis_name(remote_process).
Edit: It works if
the 2 nodes are connected (check with nodes())
the process is registered with the global module
the process is still alive
If any of these conditions is not satisfied, you will get undefined.
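The lookup can be wrapped so that the undefined case is handled explicitly. This is a minimal sketch, assuming a hypothetical helper module global_lookup_sketch; only global:whereis_name/1 comes from the answer.

```erlang
%% global_lookup_sketch.erl (hypothetical helper module)
-module(global_lookup_sketch).
-export([lookup/1]).

%% Look up a globally registered name and turn the 'undefined'
%% result into a tagged error, so callers can pattern match on it.
lookup(Name) ->
    case global:whereis_name(Name) of
        undefined -> {error, not_registered};
        Pid when is_pid(Pid) -> {ok, Pid}
    end.
```

For the module in the question, global_lookup_sketch:lookup(remote_process). would return {ok, Pid} only while all three conditions listed above hold.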
Edit 2: start node 1 with werl -sname p1 and type in the shell:
(p1@W7FRR00423L)1> c(remote_process).
{ok,remote_process}
(p1@W7FRR00423L)2> remote_process:start().
I'm on node p1@W7FRR00423L
My pid is <0.69.0>
{ok,<0.69.0>}
(p1@W7FRR00423L)3> global:whereis_name(remote_process).
<0.69.0>
(p1@W7FRR00423L)4>
Then start a second node with werl -sname p2 and type in the shell (it is fine to connect the second node later; the global registration is "updated" when necessary):
(p2@W7FRR00423L)1> net_kernel:connect_node(p1@W7FRR00423L).
true
(p2@W7FRR00423L)2> nodes().
[p1@W7FRR00423L]
(p2@W7FRR00423L)3> global:whereis_name(remote_process).
<7080.69.0>
(p2@W7FRR00423L)4>
Edit 3:
In your test you spawn a process P1 on the remote node, which executes the function remote_process:start/0.
This function calls supervisor:start_link/3, which spawns a new supervisor process P2 and links it to P1. After this, P1 has nothing left to do, so it dies, causing the linked process P2 to die too, and you get an undefined reply to the global:whereis_name call.
In my test, I start the process from the shell of the remote node; the shell does not die after I evaluate remote_process:start/0, so the supervisor process does not die and global:whereis_name finds the requested pid.
If you want the supervisor to survive the call, you need an intermediate process that is spawned without a link, so it will not die with its parent. Here is a small example based on your code:
-module(remote_process).
-behaviour(supervisor).
-export([start/0, init/1, local_spawn/0, remote_start/1]).
remote_start(Node) ->
    spawn(Node, ?MODULE, local_spawn, []).
local_spawn() ->
    % spawn without link so start_wait_stop will survive
    % the death of the local_spawn process
    spawn(fun start_wait_stop/0).
start_wait_stop() ->
    start(),
    receive
        stop -> ok
    end.
start() ->
    io:format("start (~p)~n",[self()]),
    {ok, Pid} = supervisor:start_link({global, ?MODULE}, ?MODULE, []),
    {ok, Pid}.
init([]) ->
    io:format("I'm on node ~p~n", [node()]),
    io:format("My pid is ~p~n", [self()]),
    {ok, {{one_for_one, 1, 5}, []}}.
In the shell on node 1 you get:
(p1@W7FRR00423L)1> net_kernel:connect_node(p2@W7FRR00423L).
true
(p1@W7FRR00423L)2> c(remote_process).
{ok,remote_process}
(p1@W7FRR00423L)3> global:whereis_name(remote_process).
undefined
(p1@W7FRR00423L)4> remote_process:remote_start(p2@W7FRR00423L).
<7080.68.0>
start (<7080.69.0>)
I'm on node p2@W7FRR00423L
My pid is <7080.70.0>
(p1@W7FRR00423L)5> global:whereis_name(remote_process).
<7080.70.0>
(p1@W7FRR00423L)6> global:whereis_name(remote_process).
undefined
and on node 2:
(p2@W7FRR00423L)1> global:registered_names(). % before step 4
[]
(p2@W7FRR00423L)2> global:registered_names(). % after step 4
[remote_process]
(p2@W7FRR00423L)3> rp(processes()).
[<0.0.0>,<0.1.0>,<0.4.0>,<0.30.0>,<0.31.0>,<0.33.0>,
<0.34.0>,<0.35.0>,<0.36.0>,<0.37.0>,<0.38.0>,<0.39.0>,
<0.40.0>,<0.41.0>,<0.42.0>,<0.43.0>,<0.44.0>,<0.45.0>,
<0.46.0>,<0.47.0>,<0.48.0>,<0.49.0>,<0.50.0>,<0.51.0>,
<0.52.0>,<0.53.0>,<0.54.0>,<0.55.0>,<0.56.0>,<0.57.0>,
<0.58.0>,<0.62.0>,<0.64.0>,<0.69.0>,<0.70.0>]
ok
(p2@W7FRR00423L)4> pid(0,69,0) ! stop. % between steps 5 and 6
stop
(p2@W7FRR00423L)5> global:registered_names().
[]
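The effect of the link described in Edit 3 can be reproduced on a single node. The sketch below is an illustration under stated assumptions, not the answer's method: the module name link_sketch and the function survives/1 are hypothetical, and unlinking a supervisor from its starter is used here only to make the difference visible. It starts a supervisor from a short-lived process and reports whether the supervisor is still alive once that process has exited.

```erlang
%% link_sketch.erl (hypothetical module name)
-module(link_sketch).
-behaviour(supervisor).
-export([survives/1, init/1]).

%% Start a supervisor from a short-lived starter process.
%% With Unlink = true the starter removes the link before it exits.
%% Returns whether the supervisor outlived the starter.
survives(Unlink) when is_boolean(Unlink) ->
    Caller = self(),
    {Starter, MRef} = spawn_monitor(fun() ->
        {ok, SupPid} = supervisor:start_link(?MODULE, []),
        case Unlink of
            true -> unlink(SupPid);
            false -> ok
        end,
        Caller ! {started, self(), SupPid}
    end),
    Sup = receive {started, Starter, S} -> S end,
    %% Wait until the starter is gone, then give the exit signal
    %% a moment to reach the supervisor.
    receive {'DOWN', MRef, process, Starter, _Reason} -> ok end,
    timer:sleep(100),
    Alive = is_process_alive(Sup),
    exit(Sup, kill),  % clean up; harmless if the supervisor is already dead
    Alive.

%% A supervisor with no children is enough for this demonstration.
init([]) ->
    {ok, {#{strategy => one_for_one, intensity => 1, period => 5}, []}}.
```

In this sketch link_sketch:survives(false). corresponds to the failing spawn in the question, and link_sketch:survives(true). to a start where nothing ties the supervisor's lifetime to the process that launched it.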

Erlang supervisor does not restart child

I'm trying to learn about Erlang supervisors. I have a simple printer process that prints hello every 3 seconds. I also have a supervisor that should restart the printer process if any exception occurs.
Here is my code:
test.erl:
-module(test).
-export([start_link/0]).
start_link() ->
    io:format("started~n"),
    Pid = spawn_link(fun() -> loop() end),
    {ok, Pid}.
loop() ->
    timer:sleep(3000),
    io:format("hello~n"),
    loop().
test_sup.erl:
-module(test_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).
init(_Args) ->
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    ChildSpecs = [#{id => test,
                    start => {test, start_link, []},
                    restart => permanent,
                    shutdown => brutal_kill,
                    type => worker,
                    modules => [test]}],
    {ok, {SupFlags, ChildSpecs}}.
Now I run this program and start the supervisor with test_sup:start_link(). After a few seconds, I raise an exception. Why does the supervisor not restart the printer process?
Here is the shell output:
1> test_sup:start_link().
started
{ok,<0.36.0>}
hello
hello
hello
hello
2> erlang:error(err).
=ERROR REPORT==== 13-Dec-2016::00:57:10 ===
** Generic server test_sup terminating
** Last message in was {'EXIT',<0.34.0>,
{err,
[{erl_eval,do_apply,6,
[{file,"erl_eval.erl"},{line,674}]},
{shell,exprs,7,
[{file,"shell.erl"},{line,686}]},
{shell,eval_exprs,7,
[{file,"shell.erl"},{line,641}]},
{shell,eval_loop,3,
[{file,"shell.erl"},{line,626}]}]}}
** When Server state == {state,
{local,test_sup},
one_for_one,
[{child,<0.37.0>,test,
{test,start_link,[]},
permanent,brutal_kill,worker,
[test]}],
undefined,1,5,[],0,test_sup,[]}
** Reason for termination ==
** {err,[{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{shell,exprs,7,[{file,"shell.erl"},{line,686}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,641}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,626}]}]}
** exception error: err
Here's the architecture you've created with your files:
test_sup (supervisor)
^
|
v
test (worker)
Then you start your supervisor by calling start_link() in the shell. This creates another bidirectional link:
shell
^
|
v
test_sup (supervisor)
^
|
v
test (worker)
With a bidirectional link, if either side dies, the other side is killed.
When you run erlang:error, you're causing an error in your shell!
Your shell is linked to your supervisor, so Erlang kills the supervisor in response. By chain reaction, your worker gets killed too.
I think you intended to send the error condition to your worker rather than the shell:
Determine the Pid of your worker: supervisor:which_children(test_sup).
Call erlang:exit(Pid, Reason) with the worker's Pid.
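Those two steps can be sketched as follows. To keep the example self-contained it defines its own tiny supervisor and worker; the module name restart_sketch, the child id worker and the helper names are hypothetical and only mirror the shape of test_sup from the question.

```erlang
%% restart_sketch.erl (hypothetical module name)
-module(restart_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0, worker_pid/0, kill_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent,
              shutdown => brutal_kill,
              type => worker},
    {ok, {SupFlags, [Child]}}.

%% The worker just waits; it stands in for the printer loop.
start_worker() ->
    {ok, spawn_link(fun idle/0)}.

idle() ->
    receive stop -> ok end.

%% Step 1: find the worker's pid in the supervisor's child list.
worker_pid() ->
    case lists:keyfind(worker, 1, supervisor:which_children(?MODULE)) of
        {worker, Pid, _Type, _Modules} when is_pid(Pid) -> Pid;
        _ -> undefined
    end.

%% Step 2: send the exit signal to the worker, not to the shell.
kill_worker() ->
    case worker_pid() of
        Pid when is_pid(Pid) ->
            exit(Pid, kill),
            {ok, Pid};
        undefined ->
            {error, not_running}
    end.
```

After restart_sketch:kill_worker()., a second call to restart_sketch:worker_pid(). should return a different pid, which shows that the supervisor restarted the child while the shell and the supervisor themselves stayed alive.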
When you execute erlang:error(err)., you are killing the calling process, which is your shell.
As you used start_link to start the supervisor, it is killed too, and so is the loop.
The shell is automatically restarted (thanks to some supervisor), but nobody restarts your test supervisor, so it cannot restart the loop.
To run this test you should do the following.
In module test:
start_link() ->
    Pid = spawn_link(fun() -> loop() end),
    io:format("started ~p~n",[Pid]),
    {ok, Pid}.
You will get this output:
started <0.xx.0>
where <0.xx.0> is the loop pid, and in the shell you can call
exit(pid(0,xx,0), err).
to kill the loop only.

Empty Process Mail box in Erlang

When you send a message to the shell process, you can flush all messages out by calling c:flush().
C:\Windows\System32>erl
Eshell V5.9 (abort with ^G)
1> self() ! josh.
josh
2> self() ! me.
me
3> self() ! you.
you
4> flush().
Shell got josh
Shell got me
Shell got you
ok
5>
In my understanding, this empties the mailbox of the shell process.
What is the equivalent way of emptying the mailbox of any Erlang process?
This function flushes all messages from the mailbox of whichever process calls it:
flush() ->
    receive
        _ -> flush()
    after
        0 -> ok
    end.
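If the discarded messages are of interest, the same receive ... after 0 pattern can collect them instead of dropping them. This is a minimal sketch, assuming a hypothetical module named mailbox_sketch; like the function above, it only affects the mailbox of the process that calls it.

```erlang
%% mailbox_sketch.erl (hypothetical module name)
-module(mailbox_sketch).
-export([drain/0]).

%% Remove every message currently in the caller's mailbox and
%% return them in the order they were received.
drain() ->
    drain([]).

drain(Acc) ->
    receive
        Msg -> drain([Msg | Acc])
    after 0 ->
        %% Mailbox is empty: restore arrival order and return.
        lists:reverse(Acc)
    end.
```

A process has to call this on itself, for example from a clause in its own receive loop that reacts to a request such as a drain message (an assumed protocol).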
