Is there any way to lock a variable between different processes in Erlang?

There's a variable in my module, and a receive clause that updates its value. Multiple processes call this function simultaneously. I need to lock this variable while one process is modifying it. Sample below:
mytest.erl
%%%-------------------------------------------------------------------
-module(mytest).
%% API
-export([start_link/0,display/1,callDisplay/2]).
start_link() ->
    Pid = spawn(mytest, display, ["Hello"]),
    Pid.

display(Val) ->
    io:format("It started: ~p", [Val]),
    NextVal =
        receive
            {call, Msg} ->
                NewVal = Val ++ " " ++ Msg ++ " ",
                NewVal;
            stop ->
                true
        end,
    display(NextVal).

callDisplay(Pid, Val) ->
    Pid ! {call, Val}.
Start it
Pid=mytest:start_link().
Two processes call it at the same time:
P1=spawn(mytest,callDisplay,[Pid,"Walter"]),
P2=spawn(mytest,callDisplay,[Pid,"Dave"]).
I hope it can append "Walter" and "Dave" one by one, like "Hello Walter Dave". However, when there are too many of them running together, some names (Walter, Dave, etc.) get overridden.
Because when P1 and P2 start at the same time, Val is "Hello" for both. P1 appends "Walter" to get "Hello Walter", P2 appends "Dave" to get "Hello Dave". P1 saves its result to NextVal first as "Hello Walter", then P2 saves its result to NextVal as "Hello Dave", so the result will be "Hello Dave". "Hello Walter" is replaced by "Hello Dave", and "Walter" is lost forever.
Is there any way I can lock Val, so that when we add "Walter", "Dave" will wait until the value setting is done?

Even though this is an old question, it's worth explaining.
From what you said, if I understand correctly, you expect to see
"Hello Walter" and "Hello Dave". However, you're seeing successive names appended to the previous value, as in "Hello Walter Dave..".
This behavior is normal, and to see why, let's look briefly at the Erlang memory model. An Erlang process's memory is divided into three main parts:
Process Control Block (PCB):
This holds the process pid, registered name, process state, and pointers to the messages in its queue.
Stack:
This holds function parameters, local variables and function return addresses.
Private Heap: This holds incoming messages and compound data such as tuples, lists and binaries (not larger than 64 bytes).
All data in these memory areas belongs to, and is private to, the owning process.
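A small sketch of how to peek at those areas with erlang:process_info/2 (not from the original answer; the exact sizes returned will differ from system to system):
%% Spawn a process that just parks in receive, then look at its
%% mailbox length, stack size and heap size (sizes are in words).
Pid = spawn(fun() -> receive stop -> ok end end),
erlang:process_info(Pid, [message_queue_len, stack_size, heap_size]).
%% => [{message_queue_len,0},{stack_size,...},{heap_size,...}]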
Stage 1:
When Pid=spawn(mytest,display,["Hello"]) is called, the server process is created, then the display function is called with "Hello" passed as the argument. Since display/1 executes in the server process, the "Hello" argument lives on the server process's stack. Execution of display/1 continues until it reaches the receive clause, then blocks and awaits a message matching your patterns.
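For illustration, a hypothetical shell session (the pid value will differ, and the "It started: ..." line is printed as well): once display/1 reaches receive, the server is simply parked waiting for a message.
1> Pid = mytest:start_link().
<0.88.0>
2> erlang:process_info(Pid, status).
{status,waiting}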
Stage 2:
Now P1 starts and executes ServerPid ! {call, "Walter"}, then P2 executes ServerPid ! {call, "Dave"}. In both cases, Erlang makes a copy of the message and sends it to the server process's mailbox (private heap). The copied message in the mailbox belongs to the server process, not the client.
Now, when {call, "Walter"} is matched, Msg gets bound to "Walter".
From Stage 1, we know Val is bound to "Hello", so NewVal gets bound to Val ++ " " ++ Msg, i.e. "Hello Walter".
At this point, P2's message, {call, "Dave"}, is still in the server's mailbox awaiting the next receive, which will happen in the next recursive call to display/1. NextVal gets bound to NewVal, and the recursive call to display/1 is made with "Hello Walter" passed as the argument. This gives the first print, "Hello Walter", which now also lives on the server process's stack.
When the receive clause is reached again, P2's message {call, "Dave"} is matched.
Now NewVal and NextVal get bound to "Hello Walter" ++ " " ++ "Dave" = "Hello Walter Dave". This gets passed to display/1 as the new Val, printing "Hello Walter Dave". In a nutshell, this variable is updated on every server loop. It serves the same purpose as the State term in the gen_server behaviour. In your case, successive client calls just append their message to this server state variable. Now to your question:
Is there any way I can lock Val, so that when we add "Walter", "Dave" will wait until the value setting is done?
No. Not by locking. Erlang does not work this way.
There are no process-locking constructs, because none are needed.
Data (variables) are always immutable and private (except large binaries, which live in the shared heap) to the process that created them.
Also, it's not the actual message you used in the Pid ! Msg construct that is processed by the receiving process; it's a copy of it. The Val parameter of your display/1 function is private and belongs to the server process, because it lives in the server's stack memory: every call to display/1 is made by the server process itself. So there is no way any other process can lock, or even see, that variable.
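As a quick illustration of the immutability point (standard shell behaviour, independent of your module; attempting to rebind a variable to a different value simply fails):
1> Val = "Hello".
"Hello"
2> Val = "Hello Walter".
** exception error: no match of right hand side value "Hello Walter"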
Yes. By sequential message processing.
This is exactly what the server process is doing: pulling one message at a time from its queue. When {call, "Walter"} was taken, {call, "Dave"} was waiting in the queue. The reason you see the unexpected greeting is that you change the server state (the display/1 parameter) for the next display/1 call, which is the one that processes {call, "Dave"}.
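If what you want is for each caller to wait until its own update has been applied, the usual Erlang way is a synchronous request/reply to the single process that owns the state, not a lock. A minimal sketch (module and function names are made up, not from the original post):
-module(mytest_sync).
-export([start_link/0, add_name/2]).

start_link() ->
    spawn(fun() -> loop("Hello") end).

%% Synchronous call: returns the state as it is after *this* update.
add_name(Pid, Name) ->
    Ref = make_ref(),
    Pid ! {add, self(), Ref, Name},
    receive
        {Ref, NewVal} -> NewVal
    end.

%% The server alone owns Val; requests are applied strictly one at a time.
loop(Val) ->
    receive
        {add, From, Ref, Name} ->
            NewVal = Val ++ " " ++ Name,
            From ! {Ref, NewVal},
            loop(NewVal);
        stop ->
            ok
    end.
Here mytest_sync:add_name(Pid, "Walter") returns "Hello Walter", and a later call adding "Dave" returns the state with both names appended; no update is ever lost, because only the server process touches Val.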

Related

Are erlang:send_after/3 and timer:send_after/3 intended to behave differently?

I wanted to send a message to a process after a delay, and discovered erlang:send_after/4.
When looking at the docs it looked like this is exactly what I wanted:
erlang:send_after(Time, Dest, Msg, Options) -> TimerRef
Starts a timer. When the timer expires, the message Msg is sent to the
process identified by Dest.
However, it doesn't seem to work when the destination is running on another node - it tells me one of the arguments is bad.
1> P = spawn('node@host', module, function, [Arg]).
<10585.83.0>
2> erlang:send_after(1000, P, {123}).
** exception error: bad argument
in function erlang:send_after/3
called as erlang:send_after(1000,<10585.83.0>,{123})
Doing the same thing with timer:send_after/3 appears to work fine:
1> P = spawn('node@host', module, function, [Arg]).
<10101.10.0>
2> timer:send_after(1000, P, {123}).
{ok,{-576458842589535,#Ref<0.1843049418.1937244161.31646>}}
And, the docs for timer:send_after/3 state almost the same thing as the erlang version:
send_after(Time, Pid, Message) -> {ok, TRef} | {error, Reason}
Evaluates Pid ! Message after Time milliseconds.
So the question is, why do these two functions, which on the face of it do the same thing, behave differently? Is erlang:send_after broken, or mis-advertised? Or maybe timer:send_after isn't doing what I think it is?
TL;DR
Your assumption is correct: these are intended to do the same thing, but are implemented differently.
Discussion
Things in the timer module, such as timer:send_after/2,3, work through a gen_server that implements the timer service. Like any other service, this one can get overloaded if you assign a really huge number of tasks (timers to track) to it.
erlang:send_after/3,4, on the other hand, is a BIF implemented directly within the runtime and therefore has access to system primitives like the hardware timer. If you have a ton of timers this is definitely the way to go. In most programs you won't notice the difference, though.
There is actually a note about this in the Erlang Efficiency Guide:
3.1 Timer Module
Creating timers using erlang:send_after/3 and erlang:start_timer/3 is much more efficient than using the timers provided by the timer module in STDLIB. The timer module uses a separate process to manage the timers. That process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).
The functions in the timer module that do not manage timers (such as timer:tc/3 or timer:sleep/1), do not call the timer-server process and are therefore harmless.
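To see that the timer module really is backed by a separate, registered server process while the BIF just hands back a raw timer reference, a shell session along these lines works (hypothetical; the pid and reference values will differ):
1> timer:start().                      % ensure the timer server is running
ok
2> whereis(timer_server).              % the process the timer module goes through
<0.93.0>
3> erlang:send_after(500, self(), hi). % the BIF returns a bare timer reference
#Ref<0.0.0.210>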
A workaround
A workaround to gain the efficiency of the BIF without the same-node restriction is to have a process of your own that does nothing but wait for a message to forward to another node:
-module(foo_forward).
-export([start/0, send_after/3, cancel/1]).
% Obviously this is an example only. You would want to write this to
% be compliant with proc_lib, write a proper init/N and integrate with
% OTP. Note that this snippet is missing the OTP service functions.
start() ->
    Parent = self(),
    Pid = spawn(fun() -> loop(Parent, [], none) end),
    register(?MODULE, Pid),
    Pid.
send_after(Time, Dest, Message) ->
    % The timer fires in the registered forwarder process, which then
    % does the actual (possibly cross-node) send.
    erlang:send_after(Time, ?MODULE, {forward, Dest, Message}).
cancel(TimerRef) ->
    erlang:cancel_timer(TimerRef).
loop(Parent, Debug, State) ->
    receive
        {forward, Dest, Message} ->
            Dest ! Message,
            loop(Parent, Debug, State);
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, Debug, State);
        Unexpected ->
            error_logger:warning_msg("Received message: ~tp~n", [Unexpected]),
            loop(Parent, Debug, State)
    end.
The above example is a bit shallow, but hopefully it expresses the point. It should be possible to get the efficiency of the BIF erlang:send_after/3,4 and still send messages across nodes, as well as keep the freedom to cancel a message using erlang:cancel_timer/1.
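Hypothetical usage of that sketch (the node and registered process names are made up); it relies on the forwarder doing the final Dest ! Message send, which works across nodes:
%% Start the forwarder once on the local node.
foo_forward:start(),
%% Schedule a message for a registered process on another node; the
%% returned timer reference can still be cancelled with erlang:cancel_timer/1.
TRef = foo_forward:send_after(1000, {some_server, 'other@host'}, {123}),
erlang:cancel_timer(TRef).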
But why?
The puzzle (and bug) is why erlang:send_after/3,4 does not want to work across nodes. The example you provided above looks a bit odd as the first assignment to P was the Pid <10101.10.0>, but the crashed call was reported as <10585.83.0> -- clearly not the same.
For the moment I do not know why erlang:send_after/3,4 doesn't work, but I can say with confidence that the mechanism of operation between the two is not the same. I'll look into it, but I imagine that the BIF version is actually doing some funny business within the runtime to gain efficiency and as a result signalling the target process by directly updating its mailbox instead of actually sending an Erlang message on the higher Erlang-to-Erlang level.
Maybe it is good that we have both, but this should definitely be clearly marked in the docs, and it evidently is not (I just checked).
There is some difference in timeout order if you have many timers.
The example below shows erlang:send_after does not guarantee order, but
timer:send_after does.
1> A = lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
2> [erlang:send_after(100, self(), X) || X <- A].
...
3> flush().
Shell got 2
Shell got 3
Shell got 4
Shell got 5
Shell got 6
Shell got 7
Shell got 8
Shell got 9
Shell got 10
Shell got 1
ok
4> [timer:send_after(100, self(), X) || X <- A].
...
5> flush().
Shell got 1
Shell got 2
Shell got 3
Shell got 4
Shell got 5
Shell got 6
Shell got 7
Shell got 8
Shell got 9
Shell got 10
ok

In Erlang, passing a message to all elements of a list of pids

I am trying to build a very simple barrier-synchronization server, where the server is initially fed a number of processes that will be communicating with it. When a process is done, it receives a message with that process' Pid, and it keeps a list of every process to do so. When the barrier reaches zero (all processes have sent messages), my server needs to send a message to each of these (I am using [Pid | ProcList] as my list of pids).
I have tried using a helper function to no avail, list comprehensions keep me in an infinite loop, and as such I am looking into how to use lists:foreach to take care of this.
I am fairly new to functional programming, but from what I understand, this foreach needs to take in the list as well as a lambda-calculus function to send a message to each node in the list. Due to the infix nature of "!", I have yet to find a way to do this without causing syntax errors.
How did you manage to make an infinite loop with a list comprehension? I must say, that's quite an achievement. Try this:
Message = % broadcast message goes here
ListOfPids = % list of recipients
[Pid ! Message || Pid <- ListOfPids].
If you want to use foreach, it takes a one-argument function as its first argument, so you need to wrap the send in a fun first, since ! takes two arguments.
Message = % broadcast message goes here
ListOfPids = % list of recipients
Fun = fun (Pid) -> Pid ! Message end,
lists:foreach(Fun, ListOfPids).
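A self-contained sketch of the same idea (module and message names are made up): spawn a few workers, collect their pids, then broadcast to every one of them.
-module(broadcast_demo).
-export([run/0]).

run() ->
    %% Collect the pids of three workers, then send 'go' to each of them.
    Pids = [spawn(fun worker/0) || _ <- lists:seq(1, 3)],
    lists:foreach(fun(Pid) -> Pid ! go end, Pids).

worker() ->
    receive
        go -> io:format("~p got go~n", [self()])
    end.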

erlang supervisor best way to handle ibrowse:send_req conn_failed

new to Erlang and just having a bit of trouble getting my head around the new paradigm!
OK, so I have this internal function within an OTP gen_server:
my_func() ->
    Result = ibrowse:send_req(?ROOTPAGE, [{"User-Agent", ?USERAGENT}], get),
    case Result of
        {ok, "200", _, Xml} -> % <<do some stuff that won't interest you>>
            ok;
        {error,{conn_failed,{error,nxdomain}}} -> <<what the heck do I do here?>>
    end.
If I leave out the clause for handling the failed connection, then I get an exit signal propagated to the supervisor, and it gets shut down along with the server.
What I want to happen (at least I think this is what I want to happen) is that on a connection failure I'd like to pause and then retry send_req say 10 times and at that point the supervisor can fail.
If I do something ugly like this...
{error,{conn_failed,{error,nxdomain}}} -> stop()
it shuts down the server process and, yes, I get to use my (try 10 times within 10 seconds) restart strategy until it fails, which is also the desired result. However, the return value from the server to the supervisor is 'ok', when I would really like to return {error,error_but_please_dont_fall_over_mr_supervisor}.
I strongly suspect in this scenario that I'm supposed to handle all the business stuff like retrying failed connections within 'my_func' rather than trying to get the process to stop and then having the supervisor restart it in order to try it again.
Question: what is the 'Erlang way' in this scenario ?
I'm new to Erlang too, but how about something like this?
The code is long just because of the comments. My solution (I hope I've understood your question correctly) receives the maximum number of attempts and then makes a tail-recursive call, which stops by pattern-matching the max number of attempts against the counter. It uses timer:sleep/1 to pause, to keep things simple.
%% @doc Instead of having my_func/0, you have
%% my_func/1, so we can "inject" the max number of
%% attempts. This one will call your tail-recursive
%% one.
my_func(MaxAttempts) ->
    my_func(MaxAttempts, 0).

%% @doc This one will match when the maximum number
%% of attempts has been reached, terminating the
%% tail recursion.
my_func(MaxAttempts, MaxAttempts) ->
    {error, too_many_retries};
%% @doc Here's where we do the work, by having
%% an accumulator that is incremented with each
%% failed attempt.
my_func(MaxAttempts, Counter) ->
    io:format("Attempt #~B~n", [Counter]),
    % Simulating the error here.
    Result = {error,{conn_failed,{error,nxdomain}}},
    case Result of
        {ok, "200", _, Xml} -> ok;
        {error,{conn_failed,{error,nxdomain}}} ->
            % Wait, then make the tail-recursive call.
            timer:sleep(1000),
            my_func(MaxAttempts, Counter + 1)
    end.
EDIT: If this code is in a process which is supervised, I think it's better to use a simple_one_for_one supervisor, where you can dynamically add whatever workers you need. This avoids delaying initialization due to timeouts (in a one_for_one supervisor the workers are started in order, and having sleeps at that point will stop the other processes from initializing).
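For reference, a minimal sketch of such a supervisor (module and worker names are made up; written in the old tuple-based child-spec style):
-module(retry_sup).
-behaviour(supervisor).
-export([start_link/0, start_worker/1, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Workers are added dynamically, so a slow or retrying worker never
%% delays the startup of its siblings.
start_worker(Args) ->
    supervisor:start_child(?MODULE, [Args]).

init([]) ->
    Worker = {retry_worker, {retry_worker, start_link, []},
              temporary, 5000, worker, [retry_worker]},
    {ok, {{simple_one_for_one, 10, 10}, [Worker]}}.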
EDIT2: Added an example shell execution of my_func/1:
1> c(my_func).
my_func.erl:26: Warning: variable 'Xml' is unused
{ok,my_func}
2> my_func:my_func(5).
Attempt #0
Attempt #1
Attempt #2
Attempt #3
Attempt #4
{error,too_many_retries}
With 1s delays between each printed message.

Spawn many processes erlang

I want to measure the performance of my database by measuring the time taken to do something as the number of processes increases. The intention is to then plot a graph of performance vs. number of processes. Does anyone have an idea how? I am a beginner in Erlang, please help.
Assuming your database is mnesia, this should not be hard. One way would be to have a write function and a read function. However, note that there are several activity access contexts in mnesia. To test write times, you should NOT use the transaction context, because it returns to the calling process immediately, even before a disc write has occurred. For disc writes, it's important that you look at the context called sync_transaction. Here is an example:
write(Record) ->
    Fun = fun(R) -> mnesia:write(R) end,
    mnesia:activity(sync_transaction, Fun, [Record], mnesia_frag).
The function above will return only when all active replicas of the mnesia table have committed the record onto the data disc file. Hence, to test the speed as processes increase, you need a record generator, a process spawner, the write function and finally a timing mechanism. For timing, there are the built-in functions timer:tc/1, timer:tc/2 and timer:tc/3, which return the exact time it took to execute (completely) a given function. To cut the story short, this is how I would do it:
-module(stress_test).
-compile(export_all).

-define(LIMIT,10000).

-record(book,{
    isbn,
    title,
    price,
    version}).

%% ensure this table is {type,bag}
-record(write_time,{
    isbn,
    num_of_processes,
    write_time
}).

%% Assuming table (book) already exists
%% Assuming mnesia running already
start() ->
    ensure_gproc(),
    tv:start(),
    spawn_many(?LIMIT).

spawn_many(0) -> ok;
spawn_many(N) ->
    spawn(?MODULE, process, []),
    spawn_many(N - 1).

process() ->
    gproc:reg({n, l, guid()}, ignored),
    timer:apply_interval(timer:seconds(2), ?MODULE, write, []),
    receive
        <<"stop">> -> exit(normal)
    end.

total_processes() ->
    proplists:get_value(size, ets:info(gproc)) div 3.

ensure_gproc() ->
    case lists:keymember(gproc, 1, application:which_applications()) of
        true -> ok;
        false -> application:start(gproc)
    end.

guid() ->
    random:seed(now()),
    MD5 = erlang:md5(term_to_binary([random:uniform(152629977), {node(), now(), make_ref()}])),
    MD5List = lists:nthtail(3, binary_to_list(MD5)),
    F = fun(N) -> f("~2.16.0B", [N]) end,
    L = [F(N) || N <- MD5List],
    lists:flatten(L).

generate_record() ->
    #book{isbn = guid(), title = guid(), price = guid()}.

write() ->
    Record = generate_record(),
    Fun = fun(R) -> ok = mnesia:write(R), ok end,
    %% Here is now the actual write we measure
    {Time, ok} = timer:tc(mnesia, activity, [sync_transaction, Fun, [Record], mnesia_frag]),
    %% Then we save that time and the number of processes
    %% at that instant
    NoteTime = #write_time{
        isbn = Record#book.isbn,
        num_of_processes = total_processes(),
        write_time = Time
    },
    mnesia:activity(transaction, Fun, [NoteTime], mnesia_frag).
Now there are dependencies here, especially gproc: download and build it into your Erlang lib path from here: Download Gproc. To run this, just call stress_test:start(). The write_time table will help you draw a graph of number of processes against time taken to write. As the number of processes increases from 0 to the upper limit (?LIMIT), we note the time taken to write a given record at that instant, and we also note the number of processes at that time.
UPDATE
f(S)-> f(S,[]).
f(S,Args) -> lists:flatten(io_lib:format(S, Args)).
That is the missing function. Apologies. Remember to study the write_time table using the application tv; a window opens in which you can examine the mnesia tables. Use this table to see write times increasing (performance decreasing) as the number of processes grows over time. One element I have left out is recording the actual wall-clock time of each write using time(), which may be an important parameter. You may add it to the definition of the write_time table.
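For example, a hedged sketch of that addition (the extra field name is made up, not part of the original code):
%% ensure this table is {type,bag}; at_time is the new, made-up field
-record(write_time,{
    isbn,
    num_of_processes,
    write_time,
    at_time    %% set in write/0, e.g. with time() or os:timestamp()
}).
If the table already exists, its attribute list would also have to be updated to match the extended record.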
Also look at http://wiki.basho.com/Benchmarking.html
you might look at tsung http://tsung.erlang-projects.org/

Can someone explain the structure of a Pid (Process Identifier) in Erlang?

Can someone explain the structure of a Pid in Erlang?
Pids look like this: <A.B.C>, e.g. <0.30.0>, but I would like to know the meaning of these three "bits": A, B and C.
A seems to be always 0 on a local node, but this value changes when the Pid's owner is located on another node.
Is it possible to directly send a message on a remote node using only the Pid? Something like that: <4568.30.0> ! Message, without having to explicitly specify the name of the registered process and the node name ( {proc_name, Node} ! Message)?
Printed process ids < A.B.C > are composed of:
A, the node number (0 is the local node, an arbitrary number for a remote node)
B, the first 15 bits of the process number (an index into the process table)
C, bits 16-18 of the process number (the same process number as B)
Internally, the process number is 28 bits wide on the 32-bit emulator. The odd definition of B and C comes from R9B and earlier versions of Erlang, in which B was a 15-bit process ID and C was a wrap counter incremented when the maximum process ID was reached and lower IDs were reused.
In the erlang distribution PIDs are a little larger as they include the node atom as well as the other information. (Distributed PID format)
When an internal PID is sent from one node to the other, it's automatically converted to the external/distributed PID form, so what might be <0.10.0> (inet_db) on one node might end up as <2265.10.0> when sent to another node. You can just send to these PIDs as normal.
% get the PID of the user server on OtherNode
RemoteUser = rpc:call(OtherNode, erlang,whereis,[user]),
true = is_pid(RemoteUser),
% send message to remote PID
RemoteUser ! ignore_this,
% print "Hello from <nodename>\n" on the remote node's console.
io:format(RemoteUser, "Hello from ~p~n", [node()]).
For more information see: Internal PID structure,
Node creation information,
Node creation counter interaction with EPMD
If I remember this correctly the format is <nodeid,serial,creation>.
0 is the current node, much like a computer always has the hostname "localhost" to refer to itself. This is from old memory, so it might not be 100% correct though.
But yes. You could build the pid with list_to_pid/1 for example.
PidString = "<0.39.0>",
list_to_pid(PidString) ! message.
Of course. You just use whatever method you need to build your PidString. Probably write a function that generates it and use that instead of PidString, like so:
list_to_pid( make_pid_from_term({proc_name, Node}) ) ! message
Process id < A.B.C > is composed of:
A, node id which is not arbitrary but the internal index for that node in dist_entry. (It is actually the atom slot integer for the node name.)
B, process index which refers to the internal index in the proctab, (0 -> MAXPROCS).
C, Serial which increases every time MAXPROCS has been reached.
The creation tag of 2 bits is not displayed in the pid but is used internally and increases every time the node restarts.
The PID refers to a process and a node table. So you can only send a message directly to a PID if it is known in the node from which you do the call.
It is possible that this will work if the node you do the call from already knows about the node on which the process is running.
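A quick way to check which node a pid belongs to (a hypothetical shell session on a non-distributed node; RemotePid stands for any pid obtained from another node, and the node names are made up):
1> node(self()).
nonode@nohost
2> node(RemotePid).
other@host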
Apart from what others have said, you may find this simple experiment useful to understand what is going on internally:
1> node().
nonode@nohost
2> term_to_binary(node()).
<<131,100,0,13,110,111,110,111,100,101,64,110,111,104,111,
115,116>>
3> self().
<0.32.0>
4> term_to_binary(self()).
<<131,103,100,0,13,110,111,110,111,100,101,64,110,111,104,
111,115,116,0,0,0,32,0,0,0,0,0>>
So, you can see that the node name is stored internally in the pid. More info in this section of Learn You Some Erlang.
