I was playing with the process dictionary inside a gen_server module. I called the get() function and got something like this:
[{'$ancestors',[main_server,<0.30.0>]},
{'$initial_call',{child_server,init,1}}]
What happens if I erase the process dictionary? What would go wrong?
I erased it and everything worked fine. Even when calling a function that generates an exception in the child_server, the main_server still gets the exit signal.
$ancestors is used only in the initialization stage, to get the parent's PID, which is used to catch the EXIT message coming from the parent, so that the terminate stuff can get executed. Erasing this key when the server is up and running makes no difference.
$initial_call, on the other hand, is used in the crash report by proc_lib to dump the MFA info.
A quick grep in the OTP source tree can certainly help.
I think some debug functions may use the process dictionary, for example erlang:process_info/2.
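To try this out in isolation, here is a minimal sketch of a gen_server that erases its own process dictionary on request. The module name pd_demo and the functions wipe/1 and peek/1 are made up for illustration. peek/1 reads the dictionary from outside the process with erlang:process_info/2.

```erlang
-module(pd_demo).
-behaviour(gen_server).
-export([start_link/0, wipe/1, peek/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Erase the server's process dictionary; returns what was in it.
wipe(Pid) ->
    gen_server:call(Pid, wipe).

%% Read another process's dictionary from the outside.
peek(Pid) ->
    {dictionary, Dict} = erlang:process_info(Pid, dictionary),
    Dict.

init([]) ->
    {ok, no_state}.

handle_call(wipe, _From, State) ->
    %% erase/0 clears the whole dictionary and returns the old contents.
    {reply, erase(), State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

After pd_demo:wipe(Pid) the server keeps answering calls, which matches the observation above that erasing the keys on a running server makes no visible difference.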
While tracing my modules using dbg, I ran into a problem: how do I collect messages such as spawn, exit, register, unregister, link, unlink, getting_linked and getting_unlinked, which are allowed by erlang:trace, but only for those processes which were spawned directly from my modules?
As an example, I don't need to know which processes the io module creates when I call io:format in some module function. Does anybody know how to solve this problem?
Short answer:
One way is to look for call messages followed by spawn messages.
Long answer:
I'm not an expert on dbg. The reason is that I've been using an (imho much better, safer and even handier) alternative: pan, from https://gist.github.com/gebi/jungerl/tree/master/lib/pan
The API is summarized in the html doc.
With pan:start you can trace specifying a callback module that receives all the trace messages. Then your callback module can process them, e.g. keep track of processes in ETS or a state data that is passed into every call.
The format of the trace messages is specified under pan:scan.
For examples of callback modules, you may look at src/cb_*.erl.
Now to your question:
With pan you can trace process handling and calls in your favourite module like this:
pan:start({ip, CallbackModule}, Node, all, [procs,call], {Module}).
where Module is the name of your module (in this case: sptest)
Then the callback module (in this case: cb_write) can look at the spawn messages that follow a call message within the same process, e.g.:
32 - {call,{<6761.194.0>,{'fun',{shell,<node>}}},{sptest,run,[[97,97,97]]},{1332,247999,200771}}
33 - {spawn,{<6761.194.0>,{'fun',{shell,<node>}}},{{<6761.197.0>,{io,fwrite,2}},{io,fwrite,[[77,101,115,115,97,103,101,58,32,126,115,126,110],[[97,97,97]]]}},{1332,247999,200805}}
As pan uses the same tracing back end as dbg, the trace messages (and the information) can be collected using the Erlang trace BIFs as well, but pan is much safer.
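For readers who want to try the "call followed by spawn" idea with the plain trace BIFs, here is a rough sketch. It is not how pan does it; the module name spawn_watch and the function handle/2 are made up for illustration. It assumes the standard trace message shapes {trace, Pid, call, MFA} and {trace, Pid, spawn, NewPid, MFA}. The heuristic is crude: once a process has called into the traced module, every later spawn by that process is reported.

```erlang
-module(spawn_watch).
-export([start/1, handle/2]).

%% Trace all processes, but enable call tracing only for Module,
%% so every call message means "this pid ran code in Module".
start(Module) ->
    Tracer = spawn(fun() -> loop({sets:new(), sets:new()}) end),
    erlang:trace_pattern({Module, '_', '_'}, true, [local]),
    erlang:trace(all, true, [procs, call, {tracer, Tracer}]),
    Tracer.

loop(State0) ->
    receive
        Msg ->
            {Report, State} = handle(Msg, State0),
            Report =:= skip orelse io:format("~p~n", [Report]),
            loop(State)
    end.

%% State is {Callers, Kids}: pids seen calling into Module, and
%% pids spawned by those callers.
handle({trace, Pid, call, _MFA}, {Callers, Kids}) ->
    {skip, {sets:add_element(Pid, Callers), Kids}};
handle({trace, Pid, spawn, Kid, MFA}, {Callers, Kids}) ->
    case sets:is_element(Pid, Callers) of
        true  -> {{spawn, Pid, Kid, MFA}, {Callers, sets:add_element(Kid, Kids)}};
        false -> {skip, {Callers, Kids}}
    end;
handle(Msg, {_Callers, Kids} = State)
  when tuple_size(Msg) >= 3, element(1, Msg) =:= trace ->
    %% Any other event (exit, link, register, ...) is kept only
    %% when it belongs to one of the tracked children.
    case sets:is_element(element(2, Msg), Kids) of
        true  -> {Msg, State};
        false -> {skip, State}
    end;
handle(_Other, State) ->
    {skip, State}.
```

Calling spawn_watch:start(sptest) before running the code under test would print only the events that pass the filter. Tracing all processes can be expensive on a busy node, which is part of why a wrapper such as pan is the safer choice.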
In this example, the author avoids a deadlock situation by doing:
self() ! {start_worker_supervisor, Sup, MFA}
in his gen_server's init function. I did something similar in one of my projects and was told this method was frowned upon, and that it was better to cause an immediate timeout instead. What is the accepted pattern?
Update for Erlang 19+
Consider using the new gen_statem behaviour. This behaviour supports generating events internal to the FSM:
The state function can insert events using the action() next_event and such an event is inserted as the next to present to the state function. That is, as if it is the oldest incoming event. A dedicated event_type() internal can be used for such events making them impossible to mistake for external events.
Inserting an event replaces the trick of calling your own state handling functions that you often would have to resort to in, for example, gen_fsm to force processing an inserted event before others.
Using the action functionality in that module, you can ensure your event is generated in init and always handled before any external events, specifically by creating a next_event action in your init function.
Example:
...
callback_mode() -> state_functions.
init(_Args) ->
    {ok, my_state, #data{}, [{next_event, internal, do_the_thing}]}.
my_state(internal, do_the_thing, Data) ->
    the_thing(),
    {keep_state, Data};
my_state({call, From}, Call, Data) ->
...
...
Old answer
When designing a gen_server you generally have the choice to perform actions in three different states:
When starting up, in init/1
When running, in any handle_* function
When stopping, in terminate/2
A good rule of thumb is to execute things in the handling functions when acting upon an event (call, cast, message etc). The stuff that gets executed in init should not wait for events, that's what the handle callbacks are for.
So, in this particular case, a kind of "fake" event is generated. I'd say it seems that the gen_server always wants to initiate the starting of the supervisor. Why not just do it directly in init/1? Is there really a requirement to be able to handle another message in between (the effect of doing it in handle_info/2 instead)? That window is so incredibly small (the time between the start of the gen_server and the sending of the message to self()) that it's highly unlikely to happen at all.
As for the deadlock, I would really advise against calling your own supervisor in your init function. That's just bad practice. A good design pattern for starting worker processes would be one top-level supervisor, with a manager and a worker supervisor beneath. The manager starts workers by calling the worker supervisor:
    [top_sup]
      |    \
      |     \
      |      \
     man   [work_sup]
            /  |  \
           /   |   \
          /    |    \
        w1    ...    wN
Just to complement what has already been said about splitting a server's initialisation into two parts, the first in the init/1 function and the second in either handle_cast/2 or handle_info/2: there is really only one reason to do this, and that is if the initialisation is expected to take a long time. Splitting it up then allows gen_server:start_link to return faster, which can be important for servers started by supervisors, as supervisors "hang" while starting their children and one slow-starting child can delay the whole supervisor startup.
In this case I don't think it is bad style to split the server initialisation.
It is important to be careful with errors. An error in init/1 will cause the supervisor to terminate, while an error in the second part will cause the supervisor to try to restart that child.
I personally think it is better style for the server to send a message to itself, either with an explicit ! or a gen_server:cast, as with a good descriptive message, for example init_phase_2, it will be easier to see what is going on, rather than a more anonymous timeout. Especially if timeouts are used elsewhere as well.
Calling your own supervisor sure does seem like a bad idea, but I do something similar all the time.
init(...) ->
    gen_server:cast(self(), startup),
    {ok, ...}.
handle_cast(startup, State) ->
    slow_initialisation_task_reading_from_disk_fetching_data_from_network_etc(),
    {noreply, State}.
I think this is clearer than using timeout and handle_info, it's pretty much guaranteed that no message can get ahead of the startup message (no one else has our pid until after we've sent that message), and it doesn't get in the way if I need to use timeouts for something else.
This may be a very efficient and simple solution, but I think it is not good Erlang style.
I am using timer:apply_after, which is better and does not give the impression of interacting with an external module/gen_*.
I think that the best way would be to use state machines (gen_fsm). Most of our gen_servers are really state machines; however, because of the initial effort needed to set up gen_fsm, I think we end up with gen_server.
To conclude, I would use timer:apply_after to make the code clear and efficient, or gen_fsm to be pure Erlang style (even faster).
I have just read the code snippets, but the example itself is somehow broken: I do not understand this construct of a gen_server manipulating a supervisor. Even if it is the manager of some pool of future children, that is an even more important reason to do it explicitly, without counting on process mailbox magic. Debugging this would also be hell in a bigger system.
Frankly, I don't see the point of splitting initialization. Doing heavy lifting in init does hang the supervisor, but using timeout/handle_info, sending a message to self(), or adding an init check to every handler (another possibility, though not a very convenient one) will effectively hang the calling processes. So why do I need a "working" supervisor with a "not quite working" gen_server? A clean implementation should probably include a "not_ready" reply to any message received during initialization (why not spawn the full initialization from init and send a message back to self() when it completes, which would reset the "not_ready" status?), but then the "not_ready" reply has to be handled properly by the caller, and this adds a lot of complexity. Just suspending a reply is not a good idea.
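A minimal sketch of that "not_ready" variant might look like the following. The module name lazy_server, the fetch/1 call and slow_init/0 are placeholders; the sleep stands in for whatever the real initialization does.

```erlang
-module(lazy_server).
-behaviour(gen_server).
-export([start_link/0, fetch/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

fetch(Pid) ->
    gen_server:call(Pid, fetch).

init([]) ->
    Self = self(),
    %% Run the slow part in a linked helper so init/1 returns at once.
    spawn_link(fun() -> Self ! {init_done, slow_init()} end),
    {ok, not_ready}.

%% Until the helper reports back, callers get an explicit error
%% instead of being blocked.
handle_call(fetch, _From, not_ready) ->
    {reply, {error, not_ready}, not_ready};
handle_call(fetch, _From, {ready, Data} = State) ->
    {reply, {ok, Data}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({init_done, Data}, not_ready) ->
    {noreply, {ready, Data}};
handle_info(_Other, State) ->
    {noreply, State}.

%% Hypothetical slow initialization.
slow_init() ->
    timer:sleep(200),
    loaded.
```

Every caller now has to cope with {error, not_ready}, for instance by retrying, which is exactly the extra complexity mentioned above.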
io:format calls in common_test modules don't come to the user console, although error_log messages do. I can't figure out where io:format calls DO go, either. Running ack in my repo on relevant strings turns up nothing. Does anyone know where they go?
Define {logdir, "logs"}. in your spec and then the io:format goes to the log. It is done by setting the group_leader via erlang:group_leader/2 to capture the IO output from your tests.
The actual output is underneath the respective test case in the log output of that test case. logs/index.html is your starting point to peruse.
Finally, it is somewhat loosely documented in http://www.erlang.org/doc/apps/common_test/run_test_chapter.html section 6.9.
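If the goal is to see output on the console while the tests run, Common Test has its own printing functions: ct:log writes to the test case log, ct:print writes to the console, and ct:pal does both. A small sketch, with a made-up suite name:

```erlang
-module(io_demo_SUITE).
-export([all/0, prints/1]).

all() ->
    [prints].

prints(_Config) ->
    %% Captured by Common Test; ends up in the test case log.
    io:format("only in the test case log~n"),
    %% Printed to the console only.
    ct:print("only on the console"),
    %% "Print and log": goes to both places.
    ct:pal("in the log and on the console"),
    ok.
```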
Finding dead code in Delphi is usually real simple: just compile and then scan for routines missing their blue dots. The smart linker's very good about tracking them down, most of the time.
Problem is, this doesn't work for event handlers because they're published methods, which (theoretically) could be invoked via RTTI somehow, even though this almost never happens in actual practice.
I'm trying to clean up a large VCL form unit that's been bent, folded, spindled and mutilated various times throughout its history. It would sure be nice if I had some way to find event handlers that aren't actually referenced by the form's DFM and delete them. Is there any easy way to do this? A plug-in IDE Expert, for example?
This is a bit ugly (OK, it's a lot ugly), but for one unit it's close to foolproof, and requires no additional tools:
Make sure that the current version of the form is checked into source control!
Go to the top of the interface of the class where the event handlers are. Delete all of the event handler method interfaces.
Look at Code Explorer/Error Insight. The methods which have implementations but no interfaces will be highlighted. Delete the implementations.
Now save the unit. Delphi will, one at a time, complain about the missing event handler for each event that is actually handled. Write these down as the errors come up.
Check out the original version of the form, and remove the event handlers for anything not on your list.
Use the "Rename Method" refactoring to rename each event handler. Check the "View references before refactoring" checkbox.
Check the Refactoring window. If the event handler is linked to a control, there will be a "VCL Designer Updates" section showing which control(s) are linked to the method.
This will also show if the method is called from any other units, or is assigned programmatically.
Note: this is for D2006, may be slightly different in later versions.
ModelMaker Code Explorer contains a so-called Event handler view. It also shows event handlers not connected to any component.
I don't think this is possible to do automatically. Event handlers are activated when a particular event occurs inside an object. That the event is not triggered in a given run doesn't mean that there isn't an execution pathway that leads to it.
Also, you can assign handlers dynamically at runtime, so what's used in one situation is not guaranteed to be used in another.
e.g.
button.onclick := DefaultClickHandler;
button.onClick := SpecialClickHandler;
This assumes that the click handlers match the OnClick event signature, but the code wouldn't compile if the signature were incorrect.
However, you can probably find all the abandoned handlers by looking for all the methods that have a (Sender: TObject) signature and comparing that list of methods to those in the .dfm (make sure you save it as text if you are working with an older version of Delphi). Anything not wired up automatically would be suspect in my book.
--
If you don't want to go down the Cygwin path, you can load the source and the DFM into two TStringLists, rip out the names/identifiers from each, and generate a list with a couple of loops and some string manipulation. My guess is about 20 minutes of work to get something you can live with.
There is no solution that is guaranteed to give a correct answer in the most general case (based, as you note, on the possibility of calling them via RTTI).
One solution would be to do code coverage tests and look carefully at handlers that were never reached.
I'm not aware of a preexisting app or plugin to do this, but it shouldn't be hard to script.
Assuming you're not using RTTI or manually assigning event handlers: (I'm a C++Builder user rather than Delphi, so the following may not be quite correct.)
Make a list of all published methods in your code.
The proper way to do this is to read *.pas. Find each text block that starts with a class declaration or a published directive and ends with an end, private, or public. Within each of these text blocks, extract each procedure.
The easy way to do this is to make a list of common event handler types and assume they're published.
Once you have this list, print everything from the list that's not found in your DFM file.
I'm most comfortable using Cygwin or Linux tools for scripting. Here's a bash script that works in Cygwin and should do what you want.
#!/bin/bash
find . -name '*.pas' | while read -r file; do
  echo "$file:"
  # Get a list of common event handling procedures.
  # Add more types between the | symbols.
  grep -E '^[[:space:]]+procedure.*(Click|FormCreate|FormClose|Change|Execute)\(' "$file" |
    awk '{print $2}' | cut -f 1 -d '(' | tr -d '\r' | sort -u > published.txt
  # Get a list of used event procedures.
  grep -E '^[[:space:]]+On.* =' "${file%.pas}.dfm" | awk '{print $3}' | tr -d '\r' | sort -u > used.txt
  # Compare the two (comm needs sorted input, hence the sort above).
  # Procedures listed in the left column are published but not used, so you can delete them.
  # Procedures in the right column were not found by our crude search for published event
  # handlers, so you can update the first grep command to find them.
  comm -3 published.txt used.txt
  echo
done
# Clean up.
rm -f published.txt used.txt
To actually use this, if you're not familiar with Cygwin:
Download and install Cygwin. I think the default install should give you all of the tools I used, but I'm not positive.
Save the script to your source directory as cleanup.sh.
Start a Cygwin command prompt.
If your source is in c:\MyApp, then type cd /cygdrive/c/myapp
Type ./cleanup.sh and press Enter.
There's a much easier approach than Craig's.
Go to a suspect event handler. Rename it in a consistent way (I do this by putting an x in front of the name), then go down to the implementation and do the same thing. See what the compiler thinks of it.
If it's not happy you just change the names back.
You can use the same approach to eliminate data elements that no longer do anything.