I'm writing my first distributed Erlang application, and I've noticed that I have to know which node my "service" is running on. How can I send requests to my service without knowing which node it is running on?
Basically I want to do something like this:
ReferenceToTheServiceProcess = locate(my_service).
ReferenceToTheServiceProcess ! {request, Stuff}.
Or something else to the equivalent effect (loose coupling).
Thanks!
You could register your service process with a global name, for example using gproc. That way you don't have to know which node your service currently resides on, and your code would look pretty much the way you wanted.
You can register the process using the global module.
From your service process call:
global:register_name(my_service, self()).
To send a message to the globally registered process call:
Pid = global:whereis_name(my_service),
Pid ! {request, Stuff}.
or call:
global:send(my_service, {request, Stuff}).
The registration functionality is atomic.
If the service process terminates or the node goes down, the name will be globally unregistered.
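The global registration described above could be sketched end to end as follows. This is a minimal illustration, not a definitive implementation: the module name my_service_demo, the echo loop, and the {request, From, Stuff} / {reply, Stuff} message shapes are assumptions made for the example. It also checks for undefined, which global:whereis_name/1 returns when nothing is registered under the name.

```erlang
-module(my_service_demo).
-export([start/0, request/1]).

%% Start a hypothetical echo service and register it under a global name.
start() ->
    Pid = spawn(fun loop/0),
    %% global:register_name/2 returns yes on success and no otherwise.
    yes = global:register_name(my_service, Pid),
    Pid.

%% Hypothetical service loop: echoes every request back to the sender.
loop() ->
    receive
        {request, From, Stuff} ->
            From ! {reply, Stuff},
            loop()
    end.

%% Look the service up by its global name; the caller never names a node.
request(Stuff) ->
    case global:whereis_name(my_service) of
        undefined ->
            {error, service_not_running};
        Pid ->
            Pid ! {request, self(), Stuff},
            receive
                {reply, Reply} -> {ok, Reply}
            after 1000 ->
                {error, timeout}
            end
    end.
```

Calling my_service_demo:start() on one node and my_service_demo:request(hello) on any connected node should reach the same process.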
Or, if you still want to register the process locally, you could write something like:
where_is_service(Service) ->
    %% Keep only real pids: this drops undefined and {badrpc, Reason} results.
    Results = [rpc:call(Node, erlang, whereis, [Service]) || Node <- [node() | nodes()]],
    lists:filter(fun erlang:is_pid/1, Results).
This function returns the pids of all processes registered locally (on the local node or any connected node) as Service.
After this, if you just want the first running process, take it from the list returned by the function:
send_msg_to_service(Service, Message) ->
    case where_is_service(Service) of
        [APid | _] -> APid ! Message, ok;
        [] -> {error, service_not_running}
    end.
Hope this helps!
In Erlang, can I call some function f (BIF or not) whose job is to spawn a process, run the function argf that I provide, and not "return" until argf has "returned", without using a receive clause? (The reason is that f will be invoked in a gen_server, and I don't want to pollute the gen_server's mailbox.)
A snippet would look like this:
%% some code omitted ...
F = fun() -> blah, blah, timer:sleep(10000) end,
f(F), %% like `spawn(F), but doesn't return until 10 seconds has passed`
%% ...
The only way to communicate between processes is message passing (of course, you could poll for a specific key in an ETS table or a file, but I don't like that).
If you use a spawn_monitor function in f/1 to start the F process and then have a receive block only matching the possible system messages from this monitor:
f(F) ->
    {_Pid, MonitorRef} = spawn_monitor(F),
    receive
        {_Tag, MonitorRef, _Type, _Object, _Info} -> ok
    end.
you will not mess up your gen_server mailbox. The example is the minimum code; you can add a timeout (fixed or passed as a parameter), execute some code on normal or error completion, and so on.
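The extensions mentioned above (a timeout, and different handling for normal and abnormal completion) might look like the sketch below. The module name wait_demo, the function name run_and_wait/2, and the choice to kill the worker on timeout are assumptions for illustration only.

```erlang
-module(wait_demo).
-export([run_and_wait/2]).

%% Run F in a monitored process and block until it finishes or Timeout
%% (in milliseconds) elapses. Only messages carrying this monitor's
%% reference are matched, so unrelated mailbox messages are left alone.
run_and_wait(F, Timeout) ->
    {Pid, MonitorRef} = spawn_monitor(F),
    receive
        {'DOWN', MonitorRef, process, Pid, normal} ->
            ok;
        {'DOWN', MonitorRef, process, Pid, Reason} ->
            {error, Reason}
    after Timeout ->
        %% Drop the monitor (flushing any 'DOWN' already delivered),
        %% then stop the worker so it does not linger.
        demonitor(MonitorRef, [flush]),
        exit(Pid, kill),
        {error, timeout}
    end.
```

Whether to kill the worker on timeout or let it keep running is a design choice; killing it keeps the example free of stray processes.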
You will not "pollute" the gen_server's mailbox if you spawn and wait for the message before you return from the call or cast. A more serious problem may be that you block the gen_server while you are waiting for the other process to terminate. A way around this is to not wait explicitly, but to return from the call/cast and then, when the completion message arrives, handle it in handle_info/2 and do what is necessary.
If the spawning is done in a handle_call and you want to return the "result" of that process, you can delay the reply: return {noreply, State} from handle_call and send the value with gen_server:reply/2 from the handle_info clause that handles the process termination message.
Note that however you do it a gen_server:call has a timeout value, either implicit or explicit, and if no reply is returned it generates an error in the calling process.
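A rough sketch of the deferred-reply approach described above follows. The module name deferred_server, the run/2 API, the 5000 ms call timeout, and the trick of carrying the result in the worker's exit reason as {done, Result} are all assumptions chosen for brevity, not the only way to do it.

```erlang
-module(deferred_server).
-behaviour(gen_server).
-export([start_link/0, run/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% The caller blocks (up to 5 seconds), but the server itself does not.
run(Server, F) ->
    gen_server:call(Server, {run, F}, 5000).

%% State: a map from monitor reference to the caller waiting on it.
init([]) ->
    {ok, #{}}.

handle_call({run, F}, From, Pending) ->
    %% The worker hands its result back through its exit reason.
    {_Pid, Ref} = spawn_monitor(fun() -> exit({done, F()}) end),
    %% No reply yet: the server is free to handle other requests.
    {noreply, Pending#{Ref => From}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({'DOWN', Ref, process, _Pid, Reason}, Pending) ->
    case maps:take(Ref, Pending) of
        {From, Rest} ->
            Reply = case Reason of
                        {done, Result} -> {ok, Result};
                        Other -> {error, Other}
                    end,
            gen_server:reply(From, Reply),
            {noreply, Rest};
        error ->
            {noreply, Pending}
    end;
handle_info(_Other, State) ->
    {noreply, State}.
```

The caller still sees an ordinary synchronous call, including the timeout behaviour noted above.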
The main way to communicate with a process inside the Erlang VM is message passing with the erlang:send/2 or erlang:send/3 functions (the ! operator is an alias for erlang:send/2). But you can "hack" Erlang and use several other ways of communicating between processes.
You can use erlang:link/1 to communicate the state of a process; it is mainly used to find out that a process is dying, has ended, or that something went wrong (an exception or a throw).
You can use erlang:monitor/2, which is similar to erlang:link/1 except that the message goes directly into the process mailbox.
You can also hack Erlang and use some internal mechanism (shared ETS/DETS/Mnesia tables) or external methods (a database or other things like that). This is clearly not recommended and "destroys" the Erlang philosophy... but you can do it.
It seems your problem can be solved with the supervisor behaviour. A supervisor supports several strategies for controlling supervised processes:
one_for_one: If one child process terminates and is to be restarted, only that child process is affected. This is the default restart strategy.
one_for_all: If one child process terminates and is to be restarted, all other child processes are terminated and then all child processes are restarted.
rest_for_one: If one child process terminates and is to be restarted, the 'rest' of the child processes (that is, the child processes after the terminated child process in the start order) are terminated. Then the terminated child process and all child processes after it are restarted.
simple_one_for_one: A simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process type, that is, running the same code.
You can also modify or create your own supervisor strategy from scratch or base on supervisor_bridge.
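To make the strategies above concrete, here is a minimal one_for_one supervisor sketch. The module name demo_sup, the restart limits, and the throwaway worker started by start_worker/0 are assumptions for illustration; a real worker would normally be a gen_server or another OTP behaviour.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% one_for_one: only the child that terminated is restarted.
    %% Allow at most 5 restarts within 10 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Hypothetical worker: a bare linked process that waits for a stop message.
start_worker() ->
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    {ok, Pid}.
```

Killing the worker should cause the supervisor to start a fresh one, which supervisor:which_children/1 will show under the same id with a new pid.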
So, to summarize, you need a process that waits for one or more processes to terminate. This behaviour is supported natively by OTP, but you can also create your own model. To do that, you need to share the status of every started process, using a cache or a database, or pass it along when the process is spawned. Something like this:
Fun = fun
    MyFun(ParentProcess, {result, Data}) when is_pid(ParentProcess) ->
        ParentProcess ! {self(), Data};
    MyFun(ParentProcess, MyData) when is_pid(ParentProcess) ->
        %% do something with MyData; here it is simply wrapped as the result
        MyData2 = {result, MyData},
        MyFun(ParentProcess, MyData2)
end.
%% Capture the parent's pid first: self() inside the spawned fun is the child.
Parent = self(),
spawn(fun() -> Fun(Parent, InitData) end).
EDIT: I forgot to add an example without send/receive. I use an ETS table to store the result of each lambda function. The table is passed in when the process is spawned. To get a result, we select data from this table. Note that the key of each row is the pid of the process that wrote it.
%% Ets is a table identifier as returned by ets:new/2; the table must be
%% public, since the spawned process writes to it.
spawner(Ets, Fun, Args)
  when is_function(Fun, 2) ->
    spawn(fun() -> Fun(Ets, Args) end).
Fun = fun
    F(Ets, {result, Data}) ->
        ets:insert(Ets, {self(), Data});
    F(Ets, Data) ->
        % do something here
        Data2 = Data,
        F(Ets, Data2)
end.
I have a node, app01@mdiaz, and I need to know its pid (something like <2908.77.0>).
An Erlang node doesn't have one single pid: there are many processes running on each node, so you need to specify which one you want.
If you want to know the pid of the process registered with the name foo on the node bar@localhost, you can make an RPC call to erlang:whereis/1:
(foo@localhost)1> rpc:call(bar@localhost, erlang, whereis, [foo]).
<7120.56.0>
Though you might not need that: if you want to send a message to a named process on another node, you can use {Name, Node} instead of first getting the pid. For example, to send a message to the process called foo on bar@localhost:
{foo, bar@localhost} ! my_message
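The {Name, Node} form can be tried without a second node by using node() as the target, as in this small sketch. The module name named_send_demo and the {got, Msg} acknowledgement are assumptions added so that the example has something observable to return.

```erlang
-module(named_send_demo).
-export([demo/0]).

%% Register a short-lived process as foo, then address it by {Name, Node}
%% rather than by pid. node() stands in for a remote node name here.
demo() ->
    Self = self(),
    Pid = spawn(fun() ->
                    receive Msg -> Self ! {got, Msg} end
                end),
    true = register(foo, Pid),
    {foo, node()} ! my_message,
    receive
        {got, Received} -> Received
    after 1000 ->
        timeout
    end.
```

Replacing node() with the name of a connected node sends the same message to the process registered as foo there.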
You can also go the other direction, getting the node name from a pid, with the node/1 function:
(foo@localhost)1> RemotePid = rpc:call(bar@localhost, erlang, whereis, [foo]).
<6928.32.0>
(foo@localhost)2> node(RemotePid).
bar@localhost
I want to pass some arguments to the supervisor's init/1 function, and I would like the application's interface to look like this:
redis_pool:start() % start all instances
redis_pool:start(Names) % start only given instances
Here is the application:
-module(redis_pool).
-behaviour(application).
...
start() -> % start without params
application:ensure_started(?APP_NAME, transient).
start(Names) -> % start with some params
% I want to pass Names to supervisor init function
% in order to do that I have to bypass application:ensure_started
% which is not GOOD :(
application:load(?APP_NAME),
case start(normal, [Names]) of
{ok, _Pid} -> ok;
{error, {already_started, _Pid}} -> ok
end.
start(_StartType, StartArgs) ->
redis_pool_sup:start_link(StartArgs).
Here is the supervisor:
init([]) ->
{ok, Config} = get_config(),
Names = proplists:get_keys(Config),
init([Names]);
init([Names]) ->
{ok, Config} = get_config(),
PoolSpecs = lists:map(fun(Name) ->
PoolName = pool_utils:name_for(Name),
{[Host, Port, Db], PoolSize} = proplists:get_value(Name, Config),
PoolArgs = [{name, {local, PoolName}},
{worker_module, eredis},
{size, PoolSize},
{max_overflow, 0}],
poolboy:child_spec(PoolName, PoolArgs, [Host, Port, Db])
end, Names),
{ok, {{one_for_one, 10000, 1}, PoolSpecs}}.
As you can see, the current implementation is ugly and may be buggy. The question is: how can I pass some arguments and start the application and supervisor (with the parameters given to start/1)?
One option is to start the application and run the Redis pools in two separate phases.
redis_pool:start(),
redis_pool:run([] | Names).
But what if I want to run supervisor children (redis pool) when my app starts?
Thank you.
The application callback Module:start/2 is not an API to call in order to start the application. It is called when the application is started by application:start/1,2. This means that overloading it to provide differing parameters is probably the wrong thing to do.
In particular, application:start will be called directly if someone adds your application as a dependency of theirs (in the foo.app file). At this point, they have no control over the parameters, since they come from your .app file, in the {mod, {Mod, Args}} term.
Some possible solutions:
Application Configuration File
Require that the parameters be in the application configuration file; you can retrieve them with application:get_env/2,3.
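As a hedged sketch of the configuration-file option, the supervisor's init/1 could ask the application environment which instances to start and fall back to every configured instance. The module name pool_config and the environment key instances are hypothetical; only the application name redis_pool comes from the question.

```erlang
-module(pool_config).
-export([names/1]).

%% Return the instance names to start. If the (hypothetical) instances key
%% is set in the redis_pool application environment, use it; otherwise
%% start every instance found in Config, as the original init([]) did.
names(Config) ->
    case application:get_env(redis_pool, instances) of
        {ok, Names} when is_list(Names) -> Names;
        undefined -> proplists:get_keys(Config)
    end.
```

With this in place the value can come from a sys.config entry such as {redis_pool, [{instances, [a, b]}]}, so redis_pool:start/0 stays the only entry point.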
Don't start a supervisor
This means one of two things: becoming a library application (removing the {mod, Mod} term from your .app file) -- you don't need an application behaviour; or starting a dummy supervisor that does nothing.
Then, when someone wants to use your library, they can call an API to create the pool supervisor, and graft it into their supervision tree. This is what poolboy does with poolboy:child_spec.
Or, your application-level supervisor can be a normal supervisor, with no children by default, and you can provide an API to start children of that, via supervisor:start_child. This is (more or less) what cowboy does.
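The last option, an application-level supervisor that starts empty and gains children through an API, could be sketched like this. The module name pool_sup, the start_pool/1 function, and the placeholder worker are assumptions; in the question's setting the child spec would presumably come from poolboy:child_spec instead.

```erlang
-module(pool_sup).
-behaviour(supervisor).
-export([start_link/0, start_pool/1, init/1, start_worker/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Public API: add one named child to the running supervisor.
start_pool(Name) ->
    Spec = #{id => Name, start => {?MODULE, start_worker, [Name]}},
    supervisor:start_child(?MODULE, Spec).

%% The supervisor starts with no children at all.
init([]) ->
    {ok, {#{strategy => one_for_one}, []}}.

%% Placeholder worker standing in for a real pool process.
start_worker(_Name) ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)}.
```

Because the supervisor needs no start arguments, the application can be started normally and callers decide afterwards which pools to add.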
You can pass arguments in the AppDescr argument to application:load/1 (though it's a mighty big tuple already...) as {mod, {Module, StartArgs}} according to the docs ("according to the docs" as in, I don't recall doing it this way myself, ever: http://www.erlang.org/doc/apps/kernel/application.html#load-1).
application:load({application, some_app, [{mod, {Module, [Stuff]}}]})
Without knowing anything about the internals of the application you're starting, it's hard to say which way is best, but a common way to do this is to start up the application and then send it a message containing the data you want it to know.
You could make receipt of that message tell the application to go through a configuration assertion procedure, so that the message you send on startup is the same sort of thing you would send to reconfigure it on the fly. I find this more useful than one-shotting arguments on startup.
In any case, it is usually better to think in terms of starting something, then asking it to do something for you, than to try telling it everything in init parameters. This can be as simple as having it start up and wait for some message that will tell the listener to then spin up the supervisor the way you're trying to here -- isolated one step from the application inclusion issues RL mentioned in his answer.
I'm currently reading Programming Erlang. At the end of Chapter 13, we want to create a keep-alive process;
the example looks like this:
on_exit(Pid, Fun) ->
spawn(fun() ->
Ref = monitor(process, Pid),
receive
{'DOWN', Ref, process, Pid, Info} ->
Fun(Info)
end
end).
keep_alive(Name, Fun) ->
register(Name, Pid = spawn(Fun)),
on_exit(Pid, fun(_Why) -> keep_alive(Name, Fun) end).
But the process may exit between register/2 and on_exit/2, so the monitor would fail. I changed keep_alive/2 like this:
keep_alive(Name, Fun) ->
{Pid, Ref} = spawn_monitor(Fun),
register(Name, Pid),
receive
{'DOWN', Ref, process, Pid, _Info} ->
keep_alive(Name, Fun)
end.
There is also a bug here: the process may exit between spawn_monitor/1 and register/2. How can I make this work reliably? Thanks.
I'm not sure that you have a problem that needs solving. monitor/2 will succeed even if your process exits after register/2: it will send a 'DOWN' message whose Info component is noproc. Per the documentation:
A 'DOWN' message will be sent to the monitoring process if Item dies, if Item does not exist, or if the connection is lost to the node which Item resides on. (see http://www.erlang.org/doc/man/erlang.html#monitor-2).
So, in your original code
register associates Name with the Pid
Pid dies
on_exit is called and monitor/2 is executed
monitor immediately sends a 'DOWN' message, which is received by the function spawned by on_exit
the Fun(Info) of the receive clause is executed, calling keep_alive/2
I think all is good.
So why don't you want to use the Erlang supervisor behaviour? It provides useful functions for creating and restarting keep-alive processes.
See here the example: http://www.erlang.org/doc/design_principles/sup_princ.html
In your second example, if the process exits before registration, register will fail with badarg. The easiest way around that is to surround register with try ... catch and handle the error in the catch.
You can even leave the catch empty, because even if registration failed, the 'DOWN' message will still be sent.
On the other hand, I wouldn't do that in a production system. If your worker fails that fast, it is very likely that the problem is in its initialisation code, and I would want to know that it failed to register, and stop the system. Otherwise, it could fail and be respawned in an endless loop.
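For reference, the try ... catch variant discussed above might look like the following sketch (the module name keep_alive_demo is an assumption). It deliberately ignores a failed registration, so the endless-respawn caveat applies to it as written; note also that register/2 raises badarg when the name is already taken, not only when the pid is dead.

```erlang
-module(keep_alive_demo).
-export([keep_alive/2]).

%% Spawn Fun under a monitor, try to register it as Name, and restart it
%% whenever it goes down. This function never returns, so run it in its
%% own process.
keep_alive(Name, Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    %% If Pid has already died, register/2 raises badarg. That is ignored
    %% here because the 'DOWN' message below arrives either way.
    try register(Name, Pid)
    catch error:badarg -> false
    end,
    receive
        {'DOWN', Ref, process, Pid, _Info} ->
            keep_alive(Name, Fun)
    end.
```

A typical use would be spawn(fun() -> keep_alive_demo:keep_alive(my_name, WorkerFun) end), where my_name and WorkerFun are placeholders.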
I have a (spawned) process with state.
How do I maintain a simple stateful service in Yaws?
How do I implement communication with the process in an "appmods" Erlang source file?
update:
Let's say we have a simple process:
start() -> loop(0).
loop(C) ->
receive
{inc} -> loop(C + 1);
{get, FromPid} -> FromPid ! C, loop(C)
end.
What is the simplest (trivial: without gen_server or yapp) way to access the process from the web?
Maybe I need a minimal example with gen_server+yapp+yaws / appmods+yaws.
The #arg structure is a very important datastructure for the yaws programmer.
In the Arg passed to the Yaws out/1 callback there is a field that can hold user state:
"state, %% State for use by users of the out/1 callback"
You can get detailed info here.
There are only two ways to access a process in Erlang: either you know its Pid (and the node where you expect the process to be), or you know its registered name (and the Erlang node it is expected to be on).
Let's say you have your appmod:
-module(myappmod).
-export([out/1]).
-include("PATH/TO/YAWS_SERVER/include/yaws_api.hrl").
-include("PATH/TO/YAWS_SERVER/include/yaws.hrl").
out(Arg) ->
case check_initial_state(Arg) of
unknown -> create_initial_state();
{ok,Value}->
UserPid = list_to_pid(Value),
UserPid ! extract_request(Arg),
receive
Response -> {html,format_response(Response)}
after ?TIMEOUT -> {html,"request_timedout"}
end
end.
check_initial_state(A) ->
    CookieObject = (A#arg.headers)#headers.cookie,
    case yaws_api:find_cookie_val("InitialState", CookieObject) of
        [] -> unknown;
        Cookie -> {ok, Cookie}
    end.
extract_request(Arg)-> %% get request from POST Data or Get Data
Post__data_proplist = yaws_api:parse_post(Arg),
Get_data_proplist = yaws_api:parse_query(Arg),
%% do many other things....
Request = remove_request(Post__data_proplist,Get_data_proplist),
Request.
That simple setup shows how you would use processes to keep data about a user. However, using processes this way is not good. Processes do fail, so you need a way of recovering the data they were holding.
A better approach is to have data storage for your users and one gen_server to do the lookups. You could use Mnesia. I do not advise using processes on the web to keep user state, no matter what kind of app you are building, even if it's a messaging app. Mnesia or ETS tables can keep state, and all you need to do is look it up.
Use a storage mechanism other than processes to keep state. Processes are a point of failure. Others use cookies (and/or session cookies), whose value is used in some way to look something up in a database. However, if you insist that you need processes, then have a way of remembering their Pids or registered names. You could store a user's Pid in their session cookie, etc.
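If you do go the process route for the counter in the question, the registered-name approach could be sketched as below. This is only an illustration under assumptions: the module name counter_appmod, the registered name counter, the tagged {counter, C} reply, and the 1000 ms timeout are invented for the example, and start/0 has to be called once before any request arrives (sending to an unregistered name raises badarg).

```erlang
-module(counter_appmod).
-export([start/0, out/1]).

%% Start the counter process once and give it a well-known local name.
start() ->
    Pid = spawn(fun() -> loop(0) end),
    true = register(counter, Pid),
    Pid.

loop(C) ->
    receive
        {inc} -> loop(C + 1);
        %% Tag the reply so out/1 does not pick up unrelated messages.
        {get, FromPid} -> FromPid ! {counter, C}, loop(C)
    end.

%% Appmod callback: bump the counter, then report its value as HTML.
%% Arg is ignored here, so no Yaws header files are needed.
out(_Arg) ->
    counter ! {inc},
    counter ! {get, self()},
    receive
        {counter, C} -> {html, integer_to_list(C)}
    after 1000 ->
        {html, "request timed out"}
    end.
```

The state lives only in the counter process, so it is lost if that process dies, which is the failure mode the answer above warns about.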