Basic Erlang Question - erlang

I have been trying to learn Erlang and came across some code written by Joe Armstrong:
start() ->
F = fun interact/2,
spawn(fun() -> start(F, 0) end).
interact(Browser, State) ->
receive
{browser, Browser, Str} ->
Str1 = lists:reverse(Str),
Browser ! {send, "out ! " ++ Str1},
interact(Browser, State);
after 100 ->
Browser ! {send, "clock ! tick " ++ integer_to_list(State)},
interact(Browser, State+1)
end.
It is from a blog post about using websockets with Erlang: http://armstrongonsoftware.blogspot.com/2009/12/comet-is-dead-long-live-websockets.html
Could someone please explain to me why in the start function, he spawns the anonymous function start(F, 0), when start is a function that takes zero arguments. I am confused about what he is trying to do here.

Further down in this blog post (Listings) you can see that there is another function (start/2) that takes two arguments:
start(F, State0) ->
{ok, Listen} = gen_tcp:listen(1234, [{packet,0},
{reuseaddr,true},
{active, true}]),
par_connect(Listen, F, State0).
The code sample you quoted was only an excerpt where this function was omitted for simplicity.

The reason for spawning a fun in this way is to avoid having to export a function which is only intended for internal use. One problem with is that all exported functions are available to all users even if they only meant for internal use. One example of this is a call-back module for gen_server which typically contains both the exported API for clients and the call-back functions for the gen_server behaviour. The call-back functions are only intended to be called by the gen_server behaviour and not by others but they are visible in the export list and not in anyway blocked.
Spawning a fun decreases the number of exported internal functions.

In Erlang, functions are identified by their name and their arity (the number of parameters they take). You can have more than one function with the same name, as long as they all have different numbers of parameters. The two functions you've posted above are start/0 and interact/2. start/0 doesn't call itself; instead it calls start/2, and if you take a look further down the page you linked to, you'll find the definition of start/2.
The point of using spawn in this way is to start a server process in the background and return control to the caller. To play with this code, I guess that you'd start up the Erlang interpreter, load the script and then call the start/0 function. This method would then start a process in the background and return so that you could continue to type into the Erlang interpreter.

Related

Erlang: spawn a process and wait for termination without using `receive`

In Erlang, can I call some function f (BIF or not), whose job is to spawn a process, run the function argf I provided, and doesn't "return" until argf has "returned", and do this without using receive clause (the reason for this is that f will be invoked in a gen_server, I don't want pollute the gen_server's mailbox).
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F), but doesn't return until 10 seconds has passed`
%% ...
The only way to communicate between processes is message passing (of course you can consider to poll for a specific key in an ets or a file but I dont like this).
If you use a spawn_monitor function in f/1 to start the F process and then have a receive block only matching the possible system messages from this monitor:
f(F) ->
{_Pid, MonitorRef} = spawn_monitor(F),
receive
{_Tag, MonitorRef, _Type, _Object, _Info} -> ok
end.
you will not mess your gen_server mailbox. The example is the minimum code, you can add a timeout (fixed or parameter), execute some code on normal or error completion...
You will not "pollute" the gen_servers mailbox if you spawn+wait for message before you return from the call or cast. A more serious problem with this maybe that you will block the gen_server while you are waiting for the other process to terminate. A way around this is to not explicitly wait but return from the call/cast and then when the completion message arrives handle it in handle_info/2 and then do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process then you can delay returning the value to the original call from the handle_info handling the process termination message.
Note that however you do it a gen_server:call has a timeout value, either implicit or explicit, and if no reply is returned it generates an error in the calling process.
Main way to communicate with process in Erlang VM space is message passing with erlang:send/2 or erlang:send/3 functions (alias !). But you can "hack" Erlang and use multiple way for communicating over process.
You can use erlang:link/1 to communicate stat of the process, its mainly used in case of your process is dying or is ended or something is wrong (exception or throw).
You can use erlang:monitor/2, this is similar to erlang:link/1 except the message go directly into process mailbox.
You can also hack Erlang, and use some internal way (shared ETS/DETS/Mnesia tables) or use external methods (database or other things like that). This is clearly not recommended and "destroy" Erlang philosophy... But you can do it.
Its seems your problem can be solved with supervisor behavior. supervisor support many strategies to control supervised process:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify or create your own supervisor strategy from scratch or base on supervisor_bridge.
So, to summarize, you need a process who wait for one or more terminating process. This behavior is supported natively with OTP, but you can also create your own model. For doing that, you need to share status of every started process, using cache or database, or when your process is spawned. Something like that:
Fun = fun
MyFun (ParentProcess, {result, Data})
when is_pid(ParentProcess) ->
ParentProcess ! {self(), Data};
MyFun (ParentProcess, MyData)
when is_pid(ParentProcess) ->
% do something
MyFun(ParentProcess, MyData2) end.
spawn(fun() -> Fun(self(), InitData) end).
EDIT: forgot to add an example without send/receive. I use an ETS table to store every result from lambda function. This ETS table is set when we spawn this process. To get result, we can select data from this table. Note, the key of the row is the process id of the process.
spawner(Ets, Fun, Args)
when is_integer(Ets),
is_function(Fun) ->
spawn(fun() -> Fun(Ets, Args) end).
Fun = fun
F(Ets, {result, Data}) ->
ets:insert(Ets, {self(), Data});
F(Ets, Data) ->
% do something here
Data2 = Data,
F(Ets, Data2) end.

Pass some arguments to supervisor init function when app starts

I want to pass some arguments to supervisor:init/1 function and it is desirable, that the application's interface was so:
redis_pool:start() % start all instances
redis_pool:start(Names) % start only given instances
Here is the application:
-module(redis_pool).
-behaviour(application).
...
start() -> % start without params
application:ensure_started(?APP_NAME, transient).
start(Names) -> % start with some params
% I want to pass Names to supervisor init function
% in order to do that I have to bypass application:ensure_started
% which is not GOOD :(
application:load(?APP_NAME),
case start(normal, [Names]) of
{ok, _Pid} -> ok;
{error, {already_started, _Pid}} -> ok
end.
start(_StartType, StartArgs) ->
redis_pool_sup:start_link(StartArgs).
Here is the supervisor:
init([]) ->
{ok, Config} = get_config(),
Names = proplists:get_keys(Config),
init([Names]);
init([Names]) ->
{ok, Config} = get_config(),
PoolSpecs = lists:map(fun(Name) ->
PoolName = pool_utils:name_for(Name),
{[Host, Port, Db], PoolSize} = proplists:get_value(Name, Config),
PoolArgs = [{name, {local, PoolName}},
{worker_module, eredis},
{size, PoolSize},
{max_overflow, 0}],
poolboy:child_spec(PoolName, PoolArgs, [Host, Port, Db])
end, Names),
{ok, {{one_for_one, 10000, 1}, PoolSpecs}}.
As you can see, current implementation is ugly and may be buggy. The question is how I can pass some arguments and start application and supervisor (with params who were given to start/1) ?
One option is to start application and run redis pools in two separate phases.
redis_pool:start(),
redis_pool:run([] | Names).
But what if I want to run supervisor children (redis pool) when my app starts?
Thank you.
The application callback Module:start/2 is not an API to call in order to start the application. It is called when the application is started by application:start/1,2. This means that overloading it to provide differing parameters is probably the wrong thing to do.
In particular, application:start will be called directly if someone adds your application as a dependency of theirs (in the foo.app file). At this point, they have no control over the parameters, since they come from your .app file, in the {mod, {Mod, Args}} term.
Some possible solutions:
Application Configuration File
Require that the parameters be in the application configuration file; you can retrieve them with application:get_env/2,3.
Don't start a supervisor
This means one of two things: becoming a library application (removing the {mod, Mod} term from your .app file) -- you don't need an application behaviour; or starting a dummy supervisor that does nothing.
Then, when someone wants to use your library, they can call an API to create the pool supervisor, and graft it into their supervision tree. This is what poolboy does with poolboy:child_spec.
Or, your application-level supervisor can be a normal supervisor, with no children by default, and you can provide an API to start children of that, via supervisor:start_child. This is (more or less) what cowboy does.
You can pass arguments in the AppDescr argument to application:load/1 (though its a mighty big tuple already...) as {mod, {Module, StartArgs}} according to the docs ("according to the docs" as in, I don't recall doing it this way myself, ever: http://www.erlang.org/doc/apps/kernel/application.html#load-1).
application:load({application, some_app, {mod, {Module, [Stuff]}}})
Without knowing anything about the internals of the application you're starting, its hard to say which way is best, but a common way to do this is to start up the application and then send it a message containing the data you want it to know.
You could make receipt of the message form tell the application to go through a configuration assertion procedure, so that the same message you send on startup is also the same sort of thing you would send it to reconfigure it on the fly. I find this more useful than one-shotting arguments on startup.
In any case, it is usually better to think in terms of starting something, then asking it to do something for you, than to try telling it everything in init parameters. This can be as simple as having it start up and wait for some message that will tell the listener to then spin up the supervisor the way you're trying to here -- isolated one step from the application inclusion issues RL mentioned in his answer.

What kind of types can be sent on an Erlang message?

Mainly I want to know if I can send a function in a message in a distributed Erlang setup.
On Machine 1:
F1 = Fun()-> hey end,
gen_server:call(on_other_machine,F1)
On Machine 2:
handler_call(Function,From,State) ->
{reply,Function(),State)
Does it make sense?
Here's an interesting article about "passing fun's to other Erlang nodes". To resume it briefly:
[...] As you might know, Erlang distribution
works by sending the binary encoding
of terms; and so sending a fun is also
essentially done by encoding it using
erlang:term_to_binary/1; passing the
resulting binary to another node, and
then decoding it again using
erlang:binary_to_term/1.[...]
This is pretty obvious
for most data types; but how does it
work for function objects?
When you encode a fun, what is encoded
is just a reference to the function,
not the function implementation.
[...]
[...]the definition of the function is not passed along; just exactly enough information to recreate the fun at an other node if the module is there.
[...] If the module containing the fun has not yet been loaded, and the target node is running in interactive mode; then the module is attempted loaded using the regular module loading mechanism (contained in the module error_handler); and then it tries to see if a fun with the given id is available in said module. However, this only happens lazily when you try to apply the function.
[...] If you never attempt to apply the function, then nothing bad happens. The fun can be passed to another node (which has the module/fun in question) and then everybody is happy.
Maybe the target node has a module loaded of said name, but perhaps in a different version; which would then be very likely to have a different MD5 checksum, then you get the error badfun if you try to apply it.
I would suggest you to read the whole article, cause it's extremely interesting.
You can send any valid Erlang term. Although you have to be careful when sending funs. Any fun referencing a function inside a module needs that module to exist on the target node to work:
(first#host)9> rpc:call(second#host, erlang, apply,
[fun io:format/1, ["Hey!~n"]]).
Hey!
ok
(first#host)10> mymodule:func("Hey!~n").
5
(first#host)11> rpc:call(second#host, erlang, apply,
[fun mymodule:func/1, ["Hey!~n"]]).
{badrpc,{'EXIT',{undef,[{mymodule,func,["Hey!~n"]},
{rpc,'-handle_call_call/6-fun-0-',5}]}}}
In this example, io exists on both nodes and it works to send a function from io as a fun. However, mymodule exists only on the first node and the fun generates an undef exception when called on the other node.
As for anonymous functions, it seems they can be sent and work as expected.
t1#localhost:
(t1#localhost)7> register(shell, self()).
true
(t1#localhost)10> A = me, receive Fun when is_function(Fun) -> Fun(A) end.
hello me you
ok
t2#localhost:
(t2#localhost)11> B = you.
you
(t2#localhost)12> Fn2 = fun (A) -> io:format("hello ~p ~p~n", [A, B]) end.
#Fun<erl_eval.6.54118792>
(t2#localhost)13> {shell, 't1#localhost'} ! Fn2.
I am adding coverage logic to an app built on riak-core, and the merge of results gathered can be tricky if anonymous functions cannot be used in messages.
Also check out riak_kv/src/riak_kv_coverage_filter.erl
riak_kv might be using it to filter result, I guess.

Erlang error when spawning a process

I start a process as follows
start() ->
register (dist_erlang, spawn(?MODULE, loop, [])),
ok.
But get the following error when trying to run start().
Error in process <0.62.0> with exit value: {undef,[{dist_erlang,loop,[]}]}
The module is called dist_erlang.
What am I doing wrong?
Thanks
Although the question is old, I post what helped me when I was wrestling with the Erlang compiler.
This (incomplete) snippet
-export([start/0]).
start() ->
Ping = spawn(?MODULE, ping, [[]]),
...
ping(State) ->
receive
...
end.
fails with error:
=ERROR REPORT==== 2-Sep-2013::12:17:46 ===
Error in process <0.166.0> with exit value: {undef,[{pingpong,ping,[[]],[]}]}
until you export explicitly ping/1 function. So with this export:
-export([start/0, ping/1]).
it works. I think that the confusion came from some examples from Learn You Some Erlang for great good where the modules sometimes have
-compile(export_all).
which is easy to overlook
Based on your previous question, your loop function takes one parameter, not none. Erlang is looking for loop/0 but can't find it because your function is loop/1.
The third parameter to spawn/3 is a list of parameters to pass to your function, and in the case you've shown the list is empty. Try:
register (dist_erlang, spawn(?MODULE, loop, [[]]))
In this case, the third parameter is a list that contains one element (an empty list).

Erlang - spawning processes and passing arguments

I keep running into this. I want to spawn processes and pass arguments
to them without using the MFA form (module/function/arguments), so
basically without having to export the function I want to spawn with
arguments. I've gotten around this a few times using closures(fun's)
and having the arguments just be bound values outside the fun(that I then reference inside the fun), but its
limiting my code structure... I've looked at the docs and spawn only
has the regular spawn/1 and the spawn/3 form, nothing else...
I understand that code reloading in spawned processes is not possible without the use of the MFA form but the spawned processes are not of the long running nature and finish relatively quickly so that's not an issue (I also want to contain all the code in one module-level function with sub-jobs being placed in funs inside that function).
much appreciated
thanks
actually Richard pointed me in the right direction to take to avoid the issue nicelly (in a reply to the same post I put up on the Erlang GoogleGroups):
http://groups.google.com/group/erlang-programming/browse_thread/thread/1d77a697ec67935a
His answer:
By "using closures", I hope you mean something like this:
Pid = spawn(fun () -> any_function(Any, Number, Of, Arguments) end)
How would that be limiting to your code structure?
/Richard
thank you for promptly commenting you my question. Much appreciated
Short answer: you can't. Spawn (in all it's varying forms) only takes a 0-arity function. Using a closure and bringing in bound variables from the spawning function is the way to go, short of using some sort of shared data store like ETS (which is Monster Overkill).
I've never found using a closure to severely hamper my code structure, though; can you give an example of the problems you're having, and perhaps someone can tidy it up for you?
This is an old question but I believe it can be properly answered with a bit of creativity:
The goal of the question is to
Invoke a function
With the following limits;
No M:F/A formatting
No exporting of the Invoked function
This can be solved in the following;
Using the 1st limitation leads us to the following solution:
run() ->
Module = module,
Function = function,
Args = [arg1, arg2, arg3],
erlang:spawn(Module, Function, Args).
In this solution however, the function is required to be exported.
Using the 2nd limitation (No exporting of the Invoked function) alongside the 1st leads us to the following solution using conventional erlang logic:
run() ->
%% Generate an anonymous fun and execute it
erlang:spawn(fun() -> function(arg1, arg2, arg3) end).
This solution generates Anonymous Funs every execution which may or may not be wanted based on your design due to the extra work that the Garbage Colelctor will need to perform (note that, generally, this will be neglible and issues will potentially only be seen in larger systems).
An alternative way to write the above without generating Anonymous Funs would be to spawn an erlang:apply/2 which can execute functions with given parameters.
By passing a Function Ref. to erlang:apply/2, we can reference a local function and invoke it with the given arguments.
The following implements this solution:
run() ->
%% Function Ref. to a local (non-exported) function
Function = fun function/arity,
Args = [arg1, arg2, arg3],
erlang:spawn(erlang, apply, [Function, Args]).
Edit: This type of solution can be found within the Erlang Src whereby erlang:apply/2 is being called to execute a fun() with args.
%% https://github.com/erlang/otp/blob/71af97853c40d8ac5f499b5f2435082665520642/erts/preloaded/src/erlang.erl#L2888%% Spawn and atomically set up a monitor.
-spec spawn_monitor(Fun) -> {pid(), reference()} when
Fun :: function().
spawn_monitor(F) when erlang:is_function(F, 0) ->
erlang:spawn_opt(erlang,apply,[F,[]],[monitor]);
spawn_monitor(F) ->
erlang:error(badarg, [F]).
first, there is no code and we can't help you a lot, so the best way to control your functions and their args with your spawned processes is to spawn the process with a receive function then you will be in contact with your process across the send and receive method, try:
Pid=spawn(Node, ModuleName, functionThatReceive, [])
%%or just spawn(ModuleName....) if the program is not %%distributed
Pid ! {self(), {M1, f1, A1}},
receive
{Pid, Reply} ->Reply
end,
Pid ! {self(), {M2, f2, A2}},
receive
{Pid, Reply} ->Reply
end,
.......
functionThatReceive() ->
receive
{From, {M1, f1, A1}} ->From ! {self(), doSomething1} ;
{From, {M2, f2, A2}} ->From ! {self(), doSomething2}
end.

Resources