Stop and start Erlang tracer without losing trace events - erlang

I have a question regarding tracers in Erlang, and how these can be switched on and off without losing any trace events. Suppose I have a process P1 which is being traced using the send and receive trace flags, like so:
erlang:trace(P1Pid, true, [set_on_spawn, send, 'receive', {tracer, T1Pid}])
Since the set_on_spawn flag is specified, once a (sub-)process P2 is spawned by P1, the same flags (i.e. set_on_spawn, send, 'receive') will apply to P2 as well. Now suppose I would like to create a new tracer on just P2, such that a tracer T1 handles traces from P1, and tracer T2 handles traces from P2. In order to do so, (since Erlang allows only one tracer per process), I would need to first unset the trace flags (i.e. set_on_spawn, send, 'receive') from P2 (since these are automatically inherited due to the set_on_spawn flag) and set them again on P2, as follows:
% Unset trace flags on P2.
erlang:trace(P2Pid, false, [set_on_spawn, send, 'receive']),
% We might lose trace events at this instant which were raised
% by process P2 while un-setting the tracer on P2 and setting
% it again.
% Now set again trace flags on P2, directing the trace to
% a new tracer T2.
erlang:trace(P2Pid, true, [set_on_spawn, send, 'receive', {tracer, T2Pid}]),
In the lines between setting and un-setting the tracer, a number of trace events which are raised by process P2 might be lost due to a race condition here.
My question is this: can this be achieved without losing trace events?
Does Erlang provide the means by which this 'tracer handover' (i.e. from T1 to T2) can be done in an atomic fashion?
Alternatively, is it possible to pause the Erlang VM and in doing so, pause tracing, thereby avoid losing trace events?

I have looked deeper into the problem and might have found a semi-desirable (see points below) partial work around. After reading the Erlang documentation, I came across the erlang:suspend_process/1 and erlang:resume_process/1 BIFs. Using these two, I can achieve the desired behaviour like so:
% Suspend process P2. According to the Erlang docs, this function
% blocks the caller (i.e. the current tracer) until P2 is suspended.
% This way, we do not lose trace events.
erlang:suspend_process(P2Pid),
% Unset trace flags on P2.
erlang:trace(P2Pid, false, [set_on_spawn, send, 'receive']),
% We should not lose any trace events from P2, since it is
% currently suspended, and therefore cannot generate any.
% However, we can still lose receive trace events that are
% generated as a result of other processes sending messages
% to P2.
% Now set again trace flags on P2, directing the trace to
% a new tracer T2.
erlang:trace(P2Pid, true, [set_on_spawn, send, 'receive', {tracer, T2Pid}]),
% Finally, resume process P2, so that we can receive any trace
% messages generated by P2 on the new tracer T2.
erlang:resume_process(P2Pid).
My only three concerns using this method are the following:
The Erlang documentation for erlang:suspend_process/1 and erlang:resume_process/1 explicitly states that these are to be used for debugging purposes only. My question is why cannot these be used in production when, as illustrated in the example, unless the process P2 is suspended, we face the risk of losing trace events (while switching from tracer T1 to tracer T2)?
We are actually messing around with the system (i.e. we're interfering with its scheduling). Is there a risk associated with this (apart from the fact that one can forget to call erlang:resume_process/1 on a previously suspended process)?
More importantly, even though we can prevent process P2 from taking any action, we cannot prevent other processes from sending messages to P2. These messages will result in {trace, Pid, receive, ...} trace events which might be lost while we are switching traces. Is there a way in which this can be avoided?
NB: A process P that was previously suspended by process P' is automatically resumed if P' (the one that invoked erlang:suspend_process/1) dies.

Related

Trace event message ordering using erlang:trace_delivered/1

I am trying to understand the exact semantics of erlang:trace_delivered/1 in order to determine whether this function can be suitably used to solve a problem I am currently facing. The problem is as follows.
Suppose there is a tracee process X, a tracer Y tracing X, and a third process, Z. Tracer Y is initially tracing X. Process Z is tasked with the responsibility of stopping Y from tracing X by calling erlang:trace(X, false). This call is arbitrarily effected by Z whilst X is running. Thereafter, Z is also to issue a special message stopped to tracer Y which signals to Y that Z has indeed stopped Y from tracing X.
I would like to guarantee that the special stopped message is delivered to Y's mailbox after all other trace messages have been delivered to Y. To my knowledge, Erlang does not guarantee the ordering of messages sent by different processes to a single process. Concretely in my case this means that tracer Y can, for instance, receive the stopped issued by Z before the trace event messages due to tracee X. I read about erlang:trace_delivered/1, and was planning to address my problem using the following code inside the implementation of Z:
...
erlang:trace(X, false),
Ref = erlang:trace_delivered(X),
receive
{trace_delivered, X, Ref} ->
Y ! stopped
end.
...
A somewhat similar example is provided in the docs (quoted below):
Example: Process A is Tracee, port B is tracer, and process C is the port owner of B. C wants to close B when A exits. To ensure that the trace is not truncated, C can call erlang:trace_delivered(A) when A exits, and wait for message {trace_delivered, A, Ref} before closing B.
My example differs from the one in the docs in these two respects:
Process C knows the exact point of execution of A prior to invoking erlang:trace_delivered(A); in my case, I do not know the point X is in when erlang:trace_delivered(X) is invoked by Z.
By contrast to process C, my process Z switches off tracing on X before invoking erlang:trace_delivered(X).
What are the semantics of erlang:trace_delivered/1 in this case?
Does it still guarantee that {trace_delivered, X, Ref} is received by process Z after all trace messages have been delivered to tracer Y?
And does erlang:trace_delivered/1 automatically keep track of the point at which the execution of tracee X is at to be able to provide the aforementioned guarantee?
Your help is greatly appreciated!
You are correct about the ordering of messages. If process A sends multiple messages to process B they will be guaranteed to arrive in order. If A sends multiple messages to processes B and C the message order will only be guaranteed per-process. So for example:
A sends B message 1
A sends C message 2
A sends B message 3
A sends C message 4
The only guarantees in this scenario is that B will receive message 1 before message 3, and C will receive message 2 before message 4.
To answer your questions:
No, if erlang:trace_delivered/1 is invoked prior to other trace events being generated those messages will arrive later. The docs only guarantee that prior trace messages are delivered prior to {trace_delivered, ...}:
When it is guaranteed that all trace messages are delivered to the tracer up to the point that Tracee reached at the time of the call to erlang:trace_delivered(Tracee), then a {trace_delivered, Tracee, Ref} message is sent to the caller of erlang:trace_delivered(Tracee) .
The guarantees around erlang:trace_delivered/1 always hold true. But it doesn't guarantee that {trace_delivered, ...} will always be the last message.

Erlang/OTP How to notify parent process that child processes are idle and no messages in their mailbox

I would like to design a process hierarchy where there is a a parent process P which acts like a gatekeeper and delegates the work(messages/events from its client processes) to it's children processes C1,C2..Cn which collaborate with each other and may send the result back to P. The children processes cannot talk to any process outside, only P.
The challenge is that though P may have multiple messages from its clients, it should accept only one message, delegate the work to C1..Cn and ONLY accept the next message from its clients
when all its children processes are done(or idle) and there are no more messages circulating between C1 to Cn.
P finishes accepting messages from C1..Cn so that it can return the result to its clients
Constraints:
Idle for me is when they are waiting with a receive (blocking) or simply exited.
C1 to Cn are finite state machines. Some or all of them may send messages back to C. Or there may be no messages to be sent back to C. Even if no messages are sent back to C, C has to figure out that all of them are done with no messages between them.
If any of C1 to Cn have been pre-empted, then it is considered busy(this may be obvious but I thought I'll put it here for completion) and C will not receive the next message
Is there an OTP pattern or library which will do this for me (before I hack something?). I know that process_info can let me know if the mailbox of a process are empty and I could keep on checking the children's mailboxes from P but it would be unnecessary polling from P.
EDIT GENERAL: I am trying to implement a reactive variant of Flow Based Programming on the Erlang platform. This has the notion of 'hierarchical processes' or composites which themselves may contain composite processes until we reach some boxes of actual code...I am going to research(looking at monitor,process_info,process_flag) but I wanted to respond to your excellent answers
EDIT RECURSIVE PARENTS: Each of C1 and Cn can themselves be parent/composite processes. If I just spawn processes and let them exit immediately, I'll have to create the chain of Composites everytime as C1..Cn may themselves be composites (which spawn composites..and so on). Finally, when we reach a leaf box(which is not a composite node), they are supposed to be finite state machines.. so I'm not sure of spawning and making them exit quickly if the are FSMs.
EDIT TKOWAL: Since I am trying to create a generic parent/composite process, it does not know 'when' the task ends. All it does is relay the messages it receives from its children to it's siblings with the 'constraint' that it will not accept the next message from its client/siblings until its children are 'done'. The children C1..Cn may send not just one but many messages. I understand from your proposal, that wait_for_task_finish will stop blocking the moment it gets the first message. But more messages may be emitted too by P's children. P should wait for all messages. Also, having a task_end symbol will not work for the same reason(i.e. multiple messages possible from the children)
Given how inexpensive it is to start up Erlang processes, your gatekeeper could start new children for each incoming task, and then wait for them all to exit normally once they complete their work.
But in general, it sounds like you're looking for a process pool. There are a few of these already available, such as poolboy and sidejob. Pools can be harder to get right than you think, so I advise using an existing proven pool implementation before attempting to write your own.
After edits, this became entirely different question, so I am posting new answer.
If you are trying to write Flow Based Programming, then you are probably solving wrong problem. FBP is effective, because almost everything is asynchronous and you start processing next request immediately after you finished with previous one.
So, the answer is - don't wait for children to finish:
In FBP, there is no time dependencies between the components. So if I
have a chunk of data, it should be able to flow from one end of the
diagram to the other regardless of how any other pieces of data are
being handled. In order to program an FBP system, you have to minimize
your dependencies.
source
When creating parent and children, you know all the connections between blocks, so just configure children to send processed data directly to next block. For example: P1 has children C1 and C2. You send message to P1, it delegates it to C1, packet flows couple of times between C1 and C2 and after that, C1 or C2 sends it directly to P2.
Blocks should be stateless. They output should not depend on previous requests, so even if C1 and C2 are processing data from two different requests to P1 - it is OK. There could be situations, where P1 gets data packet D1 and then D2, but will output answers in different order R2 and then R1. It is also OK. You can use Erlang reference to tag messages and then check, which response is from which request.
I don't think, there is ready library for that, but it is really easy to hack, unless I missed something. Your P process should look like this:
ready_for_next_task() ->
receive
{task, Task, CallerPid} ->
send_task_to_workers(Task)
end,
wait_for_task_finish(CallerPid).
wait_for_task_finish(CallerPid) ->
receive
{task_end, Response} ->
CallerPid ! Response
end,
ready_for_next_task().
In wait_for_task_finish/1 you have only one clause for receive, so it will not accept next task, until current one is finished. If you are waiting for multiple responses from workers, you can simply add second clause to receive with some partial response and call wait_for_task_finish/1 recursively.
It is always better to have some indicator, that the processing ended, because you don't have guarantees on message delivery time. This means, that you could check, that all processes currently are waiting for message and think, that they ended processing, but actually, they did not started yet or one of them send message to other and you caught them before the second one had it in message box.
If the processes C1..Cn have only parts of actual work and don't know about the progress, than the gatekeeper P should know how many parts there were, receive all of them one by one and then call ready_for_next_task/1.

Passing variables through exit signals [Erlang]

I'm wondering if it's possible to send variables from a dying process to it's calling process. I have a process A that spawned another process B through spawn_link. B is about to die by calling exit(killed). I can catch this in A through {'EXIT', From, killed}, but I'd like to pass some variables from B to A before it dies. I can do this by sending a message from B to A right before it dies, but I'm wondering if this is a 'bad' thing to do. Because technically I'd be sending two messages from B to A. Right now, what I have looks like this:
B sends a message with values to A
A receives values and re-enters receive loop
B calls exit(killed)
A receives EXIT message and spawns another linked process
The idea is that B should always exist and when it gets killed, it should be 'resurrected' immediately. What seems like a better alternative in my opinion is to have something like exit(killed, [Variables]) and to catch it with {'EXIT', From, killed, [Variables]}. Is this possible? And if so, are there any reasons for not doing it? Having A store values for B when B hasn't even died yet seems like a bad move. I'd have to start implementing atomic actions to prevent problems with two linked processes dying at the same time. It also forces me to keep the variables in my receive loop.
What I mean is, if I could send values directly with the EXIT call, my loop would look like this:
loop() ->
receive ->
{'EXIT', From, killed, Variables} -> % spawn new linked process with variables
end.
But if I first need to receive a message, get into the loop again to then receive the exit message, I would get;
loop(Vars) ->
receive ->
{values, Variables} -> loop(Variables);
{'EXIT', From, killed} -> % spawn new linked process with variables
end.
This means I keep the list of variables long after I don't need them anymore and I need to enter my loop twice for what could be considered one action.
To answer your question directly: the exit reason can be any term, which means it can also be a tuple like exit({killed, Values}), so instead of receiving {'EXIT', From, killed, Values} you would received {'EXIT', From, {killed, Values}}.
But!
The way you are doing it now is not wrong. Its not particularly ugly, either. Sending a message (especially an asynchronous one) isn't some major operation to be minimized as much as possible, and neither is spawning/killing processes. If your way works for you, fine.
But! (again!)
Why are you doing this in the first place? Consider what it is about state that you need to be shuttling between two processes, one of which you are terminating just then? Should this value be a permanent entity held by the spawning process? Should it die with the worker? Should it be a quantity maintained by a third process and asked for as part of the worker's startup (a more general phrasing of what Łukasz Ptaszyński was getting at)?
I don't know the answers to those questions, because I don't know your program, but they are the things I would think about if I was finding it necessary to do this sort of work. In particular, if there is some base value that process A must seed process B with for it to work, and the next version of the base value is dependent on something process B does, then process B should be returning it as a part of its processing, not as a part of its shutdown.
This seems like a minor semantic difference, but its important to think about. You may find that you shouldn't be terminating B at all, or that you really need A to manage a directory for several concurrent B's and they should seed themselves as they move along, or whatever. You might even find that this means A should be spawning B as a synchronous, monitored operation, not an asynchronous linked one, and the whole herd of processes should be spawned as a complex of multiple managed A-B pairs! I don't know the answers in your case, but these are the things that come to mind on reading what you are doing.
I think you can try this method:
main()->
ParentPid = self(),
From = spawn_link(?MODULE, child, [ParentPid]),
receive
{'EXIT', From, Reason} ->
Reason
end.
child(ParentPid) ->
Value = 2*2,
exit(ParentPid, {killed, Value}).
Please read this link about erlang:exit/2

Mnesia: How to lock multiple rows simultaneously so that I can write/read a "consistent" set of of records

HOW I WISH I HAD PHRASED MY QUESTION TO BEGIN WITH
Take a table with 26 keys, a-z and let them have integer values.
Create a process, Ouch, that does two things over and over again
In one transaction, write random values for a, b, and c such that those values always sum to 10
In another transaction, read the values for a, b and c and complain if their values do not sum to 10
If you spin-up even a few of these processes you will see that very quickly a, b and c are in a state where their values do not sum to 10. I believe there is no way to ask mnesia to "lock these 3 records before starting the writes (or reads)", one can only have mnesia lock the records as it gets to them (so to speak) which allows for the set of records' values to violate my "must sum to 10" constraint.
If I am right, solutions to this problem include
lock the entire table before writing (or reading) the set of 3 records -- I hate to lock whole table for 3 recs,
Create a process that keeps track of who is reading or writing which keys and protects bulk operations from anyone else writing or reading until the operation is completed. Of course I would have to make sure that all processes made use of this... crap, I guess this means writing my own AccessMod as the fourth parameter to activity/4 which seems like a non-trivial exercise
Some other thing that I am not smart enough to figure out.
thoughts?
Ok, I'm an ambitious Erlang newbee, so sorry if this is a dumb question, but
I am building an application-specific, in-memory distributed cache and I need to be able to write sets of Key, Value pairs in one transaction and also retrieve sets of values in one transaction. In other words I need to
1) Write 40 key,value pairs into the cache and ensure that no one else can read or write any of these 40 keys during this multi-key write operation; and,
2) Read 40 keys in one operation and get back 40 values knowing that all 40 values have been unchanged from the moment that this read operation started until it ended.
The only way I can think of doing this is to lock the entire table at the beginning of the fetch_keylist([ListOfKeys]) or at the beginning of the write_keylist([KeyValuePairs], but I don't want to do this because I have many processes simultaneously doing their own multi_key reads and writes and I don't want to lock the entire table any time any process needs to read/write a relatively small subset of records.
Help?
Trying to be more clear: I do not think this is just about using vanilla transactions
I think I am asking a more subtle question than this. Imagine that I have a process that, within a transaction, iterates through 10 records, locking them as it goes. Now imagine this process starts but before it iterates to the 3rd record ANOTHER process updates the 3rd record. This will be just fine as far as transactions go because the first process hadn't locked the 3rd record (yet) and the OTHER process modified it and released it before the first process got to it. What I want is to be guaranteed that once my first process starts that no other process can touch the 10 records until the first process is done with them.
PROBLEM SOLVED - I'M AN IDIOT... I guess...
Thank you all for your patients, especially Hynek -Pichi- Vychodil!
I prepared my test code to show the problem, and I could in fact reproduce the problem. I then simplified the code for readability and the problem went away. I was not able to again reproduce the problem. This is both embarrassing and mysterious to me since I had this problem for days. Also mnesia never complained that I was executing operations outside of a transaction and I have no dirty transactions anywhere in my code, I have no idea how I was able to introduce this bug into my code!
I have pounded the notion of Isolation into my head and will not doubt that it exists again.
Thanks for the education.
Actually, turns out the problem was using try/catch around mnesia operations within a transaction. See here for more.
Mnesia transaction will do exactly this thing for you. It is what is transaction for unless you do dirty operations. So just place your write and read operations to one transaction a mnesia will do rest. All operations in one transaction is done as one atomic operation. Mnesia transaction isolation level is what is sometimes known as "serializable" i.e. strongest isolation level.
Edit:
It seems you missed one important point about concurrent processes in Erlang. (To be fair it is not only true in Erlang but in any truly concurrent environment and when someone arguing else it is not really concurrent environment.) You can't distinguish which action happen first and which happen second unless you do some synchronization. Only way you can do this synchronization is using message passing. You have guaranteed only one thing about messages in Erlang, ordering of messages sent from one process to other process. It means when you send two messages M1 and M2 from process A to process B they arrives in same order. But if you send message M1 from A to B and message M2 from C to B they can arrive in any order. Simply because how you can even tell which message you sent first? It is even worse if you send message M1 from A to B and then M2 from A to C and when M2 arrives to C send M3 from C to B you don't have guarantied that M1 arrives to B before M3. Even it will happen in one VM in current implementation. But you can't rely on it because it is not guaranteed and can change even in next version of VM just due message passing implementation between different schedulers.
It illustrates problems of event ordering in concurrent processes. Now back to the mnesia transaction. Mnesia transaction have to be side effect free fun. It means there may not be any message sending outside from transaction. So you can't tell which transaction starts first and when starts. Only thing you can tell if transaction succeed and they order you can only determine by its effect. When you consider this your subtle clarification makes no sense. One transaction will read all keys in atomic operation even it is implemented as reading one key by one in transaction implementation and your write operation will be also performed as atomic operation. You can't tell if write to 4th key in second transaction was happen after you read 1st key in first transaction because there it is not observable from outside. Both transaction will be performed in particular order as separate atomic operation. From outside point of view all keys will be read in same point of time and it is work of mnesia to force it. If you send message from inside of transaction you violate mnesia transaction property and you can't be surprised it will behave strange. To be concrete, this message can be send many times.
Edit2:
If you spin-up even a few of these processes you will see that very
quickly a, b and c are in a state where their values do not sum to 10.
I'm curious why you think it would happen or you tested it? Show me your test case and I will show mine:
-module(transactions).
-export([start/2, sum/0, write/0]).
start(W, R) ->
mnesia:start(),
{atomic, ok} = mnesia:create_table(test, [{ram_copies,[node()]}]),
F = fun() ->
ok = mnesia:write({test, a, 10}),
[ ok = mnesia:write({test, X, 0}) || X <-
[b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]],
ok
end,
{atomic, ok} = mnesia:transaction(F),
F2 = fun() ->
S = self(),
erlang:send_after(1000, S, show),
[ spawn_link(fun() -> writer(S) end) || _ <- lists:seq(1,W) ],
[ spawn_link(fun() -> reader(S) end) || _ <- lists:seq(1,R) ],
collect(0,0)
end,
spawn(F2).
collect(R, W) ->
receive
read -> collect(R+1, W);
write -> collect(R, W+1);
show ->
erlang:send_after(1000, self(), show),
io:format("R: ~p, W: ~p~n", [R,W]),
collect(R, W)
end.
keys() ->
element(random:uniform(6),
{[a,b,c],[a,c,b],[b,a,c],[b,c,a],[c,a,b],[c,b,a]}).
sum() ->
F = fun() ->
lists:sum([X || K<-keys(), {test, _, X} <- mnesia:read(test, K)])
end,
{atomic, S} = mnesia:transaction(F),
S.
write() ->
F = fun() ->
[A, B ] = L = [ random:uniform(10) || _ <- [1,2] ],
[ok = mnesia:write({test, K, V}) || {K, V} <- lists:zip(keys(),
[10-A-B|L])],
ok
end,
{atomic, ok} = mnesia:transaction(F),
ok.
reader(P) ->
case sum() of
10 ->
P ! read,
reader(P);
_ ->
io:format("ERROR!!!~n",[]),
exit(error)
end.
writer(P) ->
ok = write(),
P ! write,
writer(P).
If it would not work it would be really serious problem. There are serious applications including payment systems which rely on it. If you have test case which shows it is broken, please report bug at erlang-bugs#erlang.org
Have you tried mnesia Events ? You can have the reader subscribe to mnesia's Table Events especially write events so as not to interrupt the process doing the writing. In this way, mnesia just keeps sending a copy of what has been written in real-time to the other process which checks what the values are at any one time. take a look at this:
subscriber()->
mnesia:subscribe({table,YOUR_TABLE_NAME,simple}),
%% OR mnesia:subscribe({table,YOUR_TABLE_NAME,detailed}),
wait_events().
wait_events()->
receive
%% For simple events
{mnesia_table_event,{write, NewRecord, ActivityId}} ->
%% Analyse the written record as you wish
wait_events();
%% For detailed events
{mnesia_table_event,{write, YOUR_TABLE, NewRecord, [OldRecords], ActivityId}} ->
%% Analyse the written record as you wish
wait_events();
_Any -> wait_events()
end.
Now you spawn your analyser as a process like this:
spawn(?MODULE,subscriber,[]).
This makes the whole process to run without any process being blocked, mnesia needs not lock any tabel or record because now what you have is a writer process and an analyser process. The whole thing will run in real-time. Remember that there are many other events that you can make use of if you wish by pattern matching them in the subscriber wait_events() receive body.
Its possible to build a heavy duty gen_server or complete application intended for reception and analysis of all your mnesia events. Its usually better to have one capable subscriber than many failing event subscribers. If i have understood you question well, this unblocking solution fits your requirements.
mnesia:read/3 with write locks seems to be suffient.
Mnesia's transaction is implemented by read-write lock and locks are well-formed (holding lock untill the end of transaction). So the isolation level is serializable.
The granularity of locks are per record as long as you access by primary key.

erlang : ordering of trace messages originating from a single process

That is the simple question, i can not find a clear answer to:
Can one assume that the order of trace messages belonging to a single process are sent in the order in which corresponding events occur ?
(The icing on the cake would of course be the source where is is specified :) )
thank you
Messages from a process A to a process B are guaranteed to always be ordered. It would be right to assume the trace events will also be ordered.
This guarantee doesn't hold when many processes message another one: if A and C both message B and A fires before C, there is no guarantee that A's message will be there first. Similarly, if A messages both B and C, there is no guarantee that C won't have its messages before B.
This could cause confusion if there is IO being done while tracing -- IO goes through a specific process (the group leader) that acts as a server, so outputting trace vs. stuff that is happening right now might give funny results.

Resources