Getting PID for a process running on a remote node - erlang

I am new to Erlang and we are working on a small scale messaging server.
We are trying to build a process registry using Redis as the datastore (we are not planning to use existing ones such as gproc or global, due to other business needs). Redis holds the mapping from user ID to {node, pid}. When a process starts, it registers itself in Redis with user_id as the key and {node, pid} as the value (the pid is stored as a string in Redis).
An example entry inserted in Redis is "user_abc", {one@mf, "<0.37.0>"}.
Now when I look up the PID for "user_abc", which is running somewhere in the cluster, I get the {node, pid} value on a remote node, in this case {one@mf, "<0.37.0>"}.
The question is: how do we use the {node, pid} details on the remote node to connect to the process user_abc?
Thanks in advance for your help.

You can get a "cluster wide" pid by parsing that pid on the remote node:
On node a:
(a@host)1> pid_to_list(self()).
"<0.39.0>"
On node b:
(b@host)1> Pid = rpc:call('a@host', erlang, list_to_pid, ["<0.39.0>"]).
<7101.99.0>
(b@host)2> Pid ! my_test.
my_test
On node a:
(a@host)2> flush().
Shell got my_test
ok
Note that the process might not be alive, so calling erlang:monitor/2 on it before talking to it might be a good idea.
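The lookup and the monitoring advice above can be combined into a small helper. This is only a sketch: the module name remote_send and both function names are made up for illustration, and the {Node, PidString} pair is assumed to come from the Redis lookup described in the question.

```erlang
-module(remote_send).
-export([resolve/2, send_monitored/3]).

%% Turn a stored {Node, PidString} pair into a pid usable on this node.
%% list_to_pid/1 is evaluated on the node that owns the process.
resolve(Node, PidString) ->
    case rpc:call(Node, erlang, list_to_pid, [PidString]) of
        Pid when is_pid(Pid) -> {ok, Pid};
        Other -> {error, Other}
    end.

%% Monitor the process, send Msg, and report an error if a 'DOWN'
%% message arrives within TimeoutMs (process dead or node unreachable).
send_monitored(Pid, Msg, TimeoutMs) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! Msg,
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {error, Reason}
    after TimeoutMs ->
        erlang:demonitor(Ref, [flush]),
        ok
    end.
```

A caller would typically do {ok, Pid} = remote_send:resolve(Node, PidString) and then remote_send:send_monitored(Pid, Msg, 1000). Note that ok only means no 'DOWN' message arrived in time, not that the message was processed.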

If you want to store an Erlang pid in Redis and have other nodes read it back as a remote pid, use erlang:term_to_binary(Pid) to store it and erlang:binary_to_term(PidBin) to read it.
See this post: Erlang Pid
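A minimal sketch of that round trip, assuming the binary is written to and read from Redis unchanged by whatever client library is in use. The module name pid_codec and its functions are hypothetical; the Redis calls themselves are left out.

```erlang
-module(pid_codec).
-export([encode/1, decode/1]).

%% Serialise a pid in the external term format; the result is a binary
%% that can be stored as the Redis value.
encode(Pid) when is_pid(Pid) ->
    term_to_binary(Pid).

%% Decode a stored binary back into a pid. The data is assumed to be
%% trusted, since binary_to_term/1 will build any term it is given.
decode(Bin) when is_binary(Bin) ->
    try binary_to_term(Bin) of
        Pid when is_pid(Pid) -> {ok, Pid};
        _Other -> {error, not_a_pid}
    catch
        error:badarg -> {error, bad_binary}
    end.
```

Because the external format carries the node name, a pid decoded on another node of the cluster refers to the original process, so no rpc call is needed to resolve it.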

Related

Erlang how to start an external script in linux

I want to run an external script and get the PID of the process (once it starts) from my erlang program. Later, I will want to send TERM signal to that PID from erlang code. How do I do it?
I tried this
P = os:cmd("myscript &"),
io:format("Pid = ~s ~n",[P]).
It starts the script in the background as expected, but I don't get the PID.
Update
I made the below script (loop.pl) for testing:
while(1){
sleep 1;
}
Then I tried to spawn the script using open_port. The script runs OK, but erlang:port_info/2 throws an exception:
2> Port = open_port({spawn, "perl loop.pl"}, []).
#Port<0.504>
3> {os_pid, OsPid} = erlang:port_info(Port, os_pid).
** exception error: bad argument
in function erlang:port_info/2
called as erlang:port_info(#Port<0.504>,os_pid)
I checked the script is running:
$ ps -ef | grep loop.pl
root 10357 10130 0 17:35 ? 00:00:00 perl loop.pl
You can open a port using spawn or spawn_executable, and then use erlang:port_info/2 to get its OS process ID:
1> Port = open_port({spawn, "myscript"}, PortOptions).
#Port<0.530>
2> {os_pid, OsPid} = erlang:port_info(Port, os_pid).
{os_pid,91270}
3> os:cmd("kill " ++ integer_to_list(OsPid)).
[]
Set PortOptions as appropriate for your use case.
As the last line above shows, you can use os:cmd/1 to kill the process if you wish.
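The same steps can be wrapped in a small module. This is a sketch under assumptions: the module name ext_proc is made up, the kill command is assumed to exist on the host (a Unix-like system), and the Erlang release is assumed to support the os_pid item. On a release that does not, port_info/2 raises a bad argument error, which may be what the update in the question ran into.

```erlang
-module(ext_proc).
-export([start/1, terminate/1]).

%% Start Command through a port and return the port together with the
%% OS process ID of the spawned program.
start(Command) ->
    Port = open_port({spawn, Command}, [exit_status]),
    case erlang:port_info(Port, os_pid) of
        {os_pid, OsPid} -> {ok, Port, OsPid};
        undefined -> {error, port_closed}
    end.

%% Send SIGTERM to the OS process via the external kill command.
terminate(OsPid) when is_integer(OsPid) ->
    _ = os:cmd("kill -TERM " ++ integer_to_list(OsPid)),
    ok.
```

With the exit_status option, the process that opened the port receives a {Port, {exit_status, Status}} message once the external program exits, which gives a way to confirm that the signal took effect.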

Query regarding Zookeeper Windows API start/stop, using Zk as a windows service(using prunsrv)

I am using ZooKeeper (3.3.3) in my product.
On Windows, I am creating a service for it using prunsrv.
I have a few queries and issues, all listed below.
Issues:
1) zkServer.cmd didn’t start on Win server 2008 machine & Win 7 Enterprise(64 bit both), had to replace the following line,
java "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%" %*
to
java "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%"
And it worked. Could it be fixed in some other way?
2) In zoo.cfg I specified the dataDir, but it still creates some other directory (bin/zookeeper-3.4.5zookeeper-3.4.5data/ version-2/snapshot) and stores the snapshots there.
Queries:
1) zkServer.cmd has no start/stop commands like zkServer.sh does, so ZooKeeper is started with zkServer.cmd and stopped with Ctrl+C/Z.
So if I start the process, it is a foreground process and gets killed when I press Ctrl+C.
2) I have to create a ZooKeeper service, and I am using prunsrv to do that. I figured out the following two ways to do so.
a)
prunsrv //IS//Zookeeper --DisplayName=" ZOOKEEPER Service" --Description=" ZOOKEEPER Service" --Startup=auto --StartMode=exe --StartPath=%ZOOKEEPER_HOME% --StartImage=%ZOOKEEPER_HOME%\bin\zkServer.cmd --StopTimeout=5 --LogPath=%LOGS_DIR% --LogPrefix=zookeeper --LogLevel=Info --PidFile=zookeeper.pid --StdOutput=auto --StdError=auto
b)
cd %ZOOKEEPER_HOME%\bin\
call "%~dp0zkEnv.cmd"
set ZOOMAIN=org.apache.zookeeper.server.quorum.QuorumPeerMain
prunsrv //IS//Zookeeper --DisplayName=" ZOOKEEPER Service" --Description=" ZOOKEEPER Service" --Jvm="%JVM_DLL%" --JvmOptions=!JAVA_OPTS! --Environment=zookeeper.log.dir=%ZOO_LOG_DIR%;zookeeper.root.logger=%ZOO_LOG4J_PROP%; --Startup=auto --LibraryPath=%LIB_DIR% --StartMode=jvm --Classpath=%CLASSPATH% %ZOOMAIN% %ZOOCFG% --StartClass=org.apache.zookeeper.server.quorum.QuorumPeerMain --StartMethod=start --StopMode=jvm --StopClass=org.apache.zookeeper.server.quorum.QuorumPeerMain --StopMethod=stop --StopTimeout=10 --LogPath=%LOGS_DIR% --LogPrefix=zookeeper --LogLevel=Info --PidFile=zookeeper.pid --StdOutput=auto --StdError=auto
Basically, in the second approach I am doing all the tasks done by zkServer.cmd myself.
=>> My query is about the second approach (2b): to stop the service there should be a stop method exposed, so that it is called when I stop the service.
Right now, if I create the service and start it, ZK runs fine, but stopping it takes indefinitely, so I have to go and kill the process.
Is there some stop() for this? I see a shutdown(), but there is no description for it.
I went through the class org.apache.zookeeper.server.quorum.QuorumPeerMain. Here main() is the start method (if my understanding is correct), and there should be some method to shut down the process.
Just got the following link
https://issues.apache.org/jira/browse/ZOOKEEPER-1122, exposes a start/stop, but the stop has some issues
it throws the following error:
E:\zookeeper-3.4.5\zookeeper-3.4.5\bin>zkServer.cmd stop
"JMX enabled by default"
"Using config: E:\zookeeper-3.4.5\zookeeper-3.4.5\bin\..\conf\zoo.cfg"
"Stopping zookeeper ... "
ERROR: The process with PID 452 (child process of PID 4) could not be terminated.
Reason: This is critical system process. Taskkill cannot end this process.
ERROR: The process with PID 4 (child process of PID 0) could not be terminated.
Reason: Access is denied.
ERROR: The process with PID 0 (child process of PID 0) could not be terminated.
Reason: This is critical system process. Taskkill cannot end this process.
STOPED
I am running this stop command in an Administrator console.
E:\zookeeper-3.4.5\zookeeper-3.4.5\bin>tasklist | findstr "java"
java.exe 10324 Console 1 36,036 K.
Any help would be highly appreciated
So what you want to do is run Zookeeper as a Windows service. I had the same requirement, here is the solution I have chosen:
Prereqs: Zookeeper & prunsrv
Set ZOOKEEPER_SERVICE environment variable to the name of the windows service to create, and ZOOKEEPER_HOME to the path to the zookeeper home folder. Then,
prunsrv.exe "//IS//%ZOOKEEPER_SERVICE%" ^
--DisplayName="Zookeeper (%ZOOKEEPER_SERVICE%)" ^
--Description="Zookeeper (%ZOOKEEPER_SERVICE%)" ^
--Startup=auto --StartMode=exe ^
--StartPath=%ZOOKEEPER_HOME% ^
--StartImage=%ZOOKEEPER_HOME%\bin\zkServer.cmd ^
--StopPath=%ZOOKEEPER_HOME%\ ^
--StopImage=%ZOOKEEPER_HOME%\bin\zkServerStop.cmd ^
--StopMode=exe --StopTimeout=5 ^
--LogPath=%ZOOKEEPER_HOME% --LogPrefix=zookeeper-wrapper ^
--PidFile=zookeeper.pid --LogLevel=Info --StdOutput=auto --StdError=auto
Add a zkServerStop.cmd file in zookeeper bin folder with the following content:
@echo off
setlocal
TASKLIST /svc | findstr /c:"%ZOOKEEPER_SERVICE%" > %ZOOKEEPER_HOME%\zookeeper_svc.pid
FOR /F "tokens=2 delims= " %%G IN (%ZOOKEEPER_HOME%\zookeeper_svc.pid) DO (
@set zkPID=%%G
)
taskkill /PID %zkPID% /T /F
del %ZOOKEEPER_HOME%\zookeeper_svc.pid
endlocal
(Of course it needs the two environment variables ZOOKEEPER_HOME & ZOOKEEPER_SERVICE to be set)
Hope it helps,
Guillaume.

how to exit remote pid, given node name and registered name?

I found that erlang:exit/2 only accepts a pid as its first parameter.
Given a node name and a process name registered locally on that node, how do I exit the process?
X = rpc:call(Node, erlang, whereis, [RegisteredName]) will give you the process ID of the remote process, and you can then do exit(X, die_please), or whatever reason you want to use.
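The lookup can fail in two ways that the one-liner does not cover: the name may not be registered, or the node may be unreachable. A minimal sketch that handles both, with a made-up module name remote_exit:

```erlang
-module(remote_exit).
-export([exit_registered/3]).

%% Look up RegisteredName on Node and send it an exit signal with Reason.
exit_registered(Node, RegisteredName, Reason) ->
    case rpc:call(Node, erlang, whereis, [RegisteredName]) of
        Pid when is_pid(Pid) ->
            exit(Pid, Reason),
            ok;
        undefined ->
            {error, not_registered};
        {badrpc, Why} ->
            {error, {badrpc, Why}}
    end.
```

A process that traps exits will receive the signal as a message unless Reason is kill, so remote_exit:exit_registered(Node, Name, kill) is the variant that cannot be ignored.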
Here is my way on Windows:
start "observer:start" erl -name a@a -setcookie xxxxxxx
net_adm:ping('xxx@xxxxx').
observer:start().
Once Observer is running and the remote node is connected, you can work out the rest from its window.

BigCouch cluster connection issue

I have successfully set up BigCouch on two different machines. Both of them run very well locally. When I join them in a cluster using one or both of these commands:
curl -X PUT machine1:5986/nodes/bigcouch@machine2 -d {}
curl -X PUT machine2:5986/nodes/bigcouch@machine1 -d {}
I always receive positive results. The nodes database contains two documents, bigcouch@machine2 and bigcouch@machine1. But in fact the cluster never works. I saw this error message in the BigCouch console:
=ERROR REPORT==== 9-Dec-2011::20:01:40 ===
Error in process <0.3117.0> on node 'bigcouch@machine1.fr' with exit value: {{rexi_DOWN,noconnect},[{mem3_rep,rexi_call,2},{mem3_rep,replicate_batch,1},{mem3_rep,go,3},{mem3_rep,go,2}]}
<148>1 2011-12-09T19:01:40.559992Z machine1 twig <0.159.0> -------- - mem3_sync nodes -> 'bigcouch@machine2' {{rexi_DOWN,noconnect}, [{mem3_rep,rexi_call,2}, {mem3_rep,replicate_batch,1}, {mem3_rep,go,3}, {mem3_rep,go,2}]}
<148>1 2011-12-09T19:01:40.560106Z machine1 twig <0.159.0> -------- - mem3_sync dbs -> 'bigcouch@machine2' {{rexi_DOWN,noconnect}, [{mem3_rep,rexi_call,2}, {mem3_rep,replicate_batch,1}, {mem3_rep,go,3}, {mem3_rep,go,2}]}
<148>1 2011-12-09T19:01:40.560205Z machine1 twig <0.159.0> -------- - mem3_sync _users -> 'bigcouch@machine2' {{rexi_DOWN,noconnect}, [{mem3_rep,rexi_call,2}, {mem3_rep,replicate_batch,1}, {mem3_rep,go,3}, {mem3_rep,go,2}]}
[error] [emulator] [--------] Error in process <0.3198.0> on node 'bigcouch@machine2' with exit value: {{rexi_DOWN,noconnect},[{mem3_rep,rexi_call,2},{mem3_rep,replicate_batch,1},{mem3_rep,go,3},{mem3_rep,go,2}]}
<147>1 2011-12-09T19:01:45.560979Z machine1 twig emulator msg - Error in process <0.3198.0> on node 'bigcouch@machine1' with exit value: {{rexi_DOWN,noconnect},[{mem3_rep,rexi_call,2},{mem3_rep,replicate_batch,1},{mem3_rep,go,3},{mem3_rep,go,2}]}
Maybe it's firewalled? If yes, please tell me the port range to open so the nodes can connect to each other. If not, please explain the cause and how to get them connected.
The documentation asks that the nodes can ping each other and that they are set to the same magic cookie. My machines can ping each other, but what is the magic cookie?
Occasionally you can see this error when a node is first connected as there are various processes that receive update messages and monitor the other nodes as well as an internal replicator. These messages are harmless but if you see "noconnect" persistently then something is wrong.
On each instance there is a file, /etc/vm.args, in which you will see two values of interest: -name and -setcookie. The first, -name, corresponds to the doc id you must use when connecting the nodes, and the second is the magic cookie, which must be the same on all the Erlang nodes for them to talk to one another. If this cookie isn't set, it defaults to the value in ~/.erlang.cookie
When you execute "make dev" it will build a 3 node cluster that you can inspect to see how these bits should be set.
Also, you only need to run the connect on one side, e.g. node2 to node1, as the internal replicator will sync the nodes dbs across the cluster.
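To see which name and cookie a running node actually ended up with, and whether it can reach its peer at the Erlang distribution level, something like the following can be evaluated on the node. It is a sketch only: the module name cluster_check is made up, and getting a shell on the BigCouch node is assumed to be possible. Note that net_adm:ping/1 tests Erlang connectivity, which is not the same as an OS-level ping.

```erlang
-module(cluster_check).
-export([report/1]).

%% Return this node's name, the cookie it is using, and the result of an
%% Erlang-level ping to Peer (pong on success, pang on failure).
report(Peer) when is_atom(Peer) ->
    [{self, node()},
     {cookie, erlang:get_cookie()},
     {ping, net_adm:ping(Peer)}].
```

If cluster_check:report('bigcouch@machine2') shows pang while the cookies match on both sides, the node names or the network path between the machines are the next things to check.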

How to create a daemon program with erlang? [closed]

I am preparing to develop a heartbeat program, which needs to send a UDP packet every 5 s.
How do I sleep for 5 s in Erlang? Is there a sleep(5) function I can use?
How do I make it run in the background?
If you want your application to send a UDP packet, I would recommend starting with a gen_server (because you will obviously need to add other functionality to your application).
1. For sending packets at regular interval.
timer:send_interval(5000,interval),
This will send the message interval to the gen_server every 5 seconds. It arrives in the handle_info(interval, State) callback, from where you can send your packets.
2. Making it run in background.
As already posted use "run_erl". I have used this myself to run my application successfully as a daemon.
run_erl -daemon /tmp "erl"
This will create two pipes, "erlang.pipe.1.r" and "erlang.pipe.1.w", under the "/tmp" directory, and you can write commands to the write pipe to start your application, using Perl or any scripting language, or even C/C++ :)
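For step 1, a minimal gen_server sketch might look like the following. The module name heartbeat, the start_link/3 arguments and the <<"heartbeat">> payload are assumptions made for illustration; the interval is a parameter so that 5000 can be passed for the 5-second case.

```erlang
-module(heartbeat).
-behaviour(gen_server).
-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Host and Port identify the receiver; IntervalMs is the send period.
start_link(Host, Port, IntervalMs) ->
    gen_server:start_link(?MODULE, {Host, Port, IntervalMs}, []).

init({Host, Port, IntervalMs}) ->
    %% Port 0 lets the OS pick any free local port for sending.
    {ok, Socket} = gen_udp:open(0, [binary]),
    {ok, _TRef} = timer:send_interval(IntervalMs, interval),
    {ok, {Socket, Host, Port}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The timer message arrives here, not in handle_call.
handle_info(interval, {Socket, Host, Port} = State) ->
    _ = gen_udp:send(Socket, Host, Port, <<"heartbeat">>),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.

terminate(_Reason, {Socket, _Host, _Port}) ->
    gen_udp:close(Socket).
```

Calling heartbeat:start_link({127,0,0,1}, 9000, 5000) would then send one packet every 5 seconds to port 9000 on the local host; the address and port are placeholders.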
Recently I have been learning the erlang programming language. One task I gave myself was to write a linux daemon.
As you probably already know, daemons are used to run unix services. Services commonly controlled by daemons include database servers, web servers, web proxies etc. In this example the server is very simple, the client calls the function "say_hi" and the server responds with "hello".
In the linux environment daemons are controlled by scripts that are stored in places such as /etc/init.d. These scripts respond according to convention to the commands start, stop and restart.
Let us start with the shell script:
#!/bin/sh
EBIN=$HOME/Documents/Erlang/Daemon
ERL=/usr/bin/erl
case $1 in
start|stop|restart)
$ERL -detached -sname mynode \
-run daemon shell_do $1 >> daemon2.log
;;
*)
echo "Usage: $0 {start|stop|restart}"
exit 1
esac
exit 0
This has to be one of the simplest shell scripts that you have ever seen. Daemons respond to three different commands: stop, start and restart. In this script the command is simply passed through to the daemon. One improvement would be to exit with the return code from the daemon execution.
So how about the daemon? Here it is...
%% PURPOSE
%% Author: Tony Wallace
%%
%% Manage an erlang daemon process as controlled by a shell scripts
%% Allow standard daemon control verbs
%% Start - Starts a daemon in detached mode and exits
%% Stop - Attaches to the daemon, monitors it, sends an EXIT message and waits for it to die
%% Restart - Calls stop and then start
%% Log events
%% Return UNIX compatible codes for functions called from shell scripts
%% Exit shell script calls so as to not stop the scripts from completing
%% Shell scripts expected to use shell_do to execute functions
%%
%% Allow interaction with daemon from other erlang nodes.
%% Erlang processes are expected to call functions directly rather than through shell_do
%%
%% MOTIVATION
%% Erlang is great, but as an application it needs to be managed by system scripts.
%% This is particularly for process that are expected to be running without user initiation.
%%
%% INVOCATION
%% See daemon.sh for details of calling this module from a shell script.
%%
%% TO DO
%% Define and use error handler for spawn call.
-module(daemon).
%-compile([{debug_info}]).
-export [start/0,start/1,stop_daemon/0,say_hi/0,kill/0,shell_do/1].
%%-define (DAEMON_NAME,daemon@blessing).
-define (DAEMON_NAME,list_to_atom("daemon@"++net_adm:localhost())).
-define (UNIX_OKAY_RESULT,0).
-define (TIMEOUT_STARTING_VM,1).
-define (VM_STARTED_WITHOUT_NAME,2).
-define (INVALID_VERB,3).
-define (COULD_NOT_CONNECT,4).
-define (TIMEOUT_WAITING_QUIT,5).
-define (TIMEOUT_STOPPING_VM,6).
wait_vm_start(_,0) -> ?TIMEOUT_STARTING_VM;
wait_vm_start(D,N) ->
net_kernel:connect(D),
Dl = lists:filter(fun(X) -> X==D end,nodes()),
if Dl =:= [] ->
receive after 1000 -> true end,
wait_vm_start(D,N-1);
Dl /= [] -> ?UNIX_OKAY_RESULT
end.
wait_vm_stop(_,0) -> ?TIMEOUT_STOPPING_VM;
wait_vm_stop(D,N) ->
net_kernel:connect(D),
Dl = lists:filter(fun(X) -> X==D end,nodes()),
if Dl /= [] ->
receive after 1000 -> true end,
wait_vm_stop(D,N-1);
Dl == [] -> ?UNIX_OKAY_RESULT
end.
flush() ->
receive
_ ->
flush()
after
0 ->
true
end.
sd(Hdl) ->
MyNode=node(),
if
MyNode =:= nonode@nohost ->
info(standard_io,"~s",["Error: Erlang not started with a name. Use -sname <name>"]),
?VM_STARTED_WITHOUT_NAME;
MyNode /= nonode@nohost ->
Atm_daemon = ?DAEMON_NAME,
Connected = net_kernel:connect(Atm_daemon),
case Connected of
true ->
info(Hdl,"~s",["daemon process already started"]),
?UNIX_OKAY_RESULT;
false ->
info(Hdl,"~s",["starting daemon process"]),
StartString = "erl -detached -sname daemon",
os:cmd(StartString),
Vm_daemon = wait_vm_start(Atm_daemon,10),
case Vm_daemon of
?UNIX_OKAY_RESULT ->
info(Hdl,"~s",["spawning main daemon process"]),
spawn(Atm_daemon,?MODULE,start,[]), ?UNIX_OKAY_RESULT;
A -> A
end
end % case Connected %
end.
say_hi() ->
Daemon = ?DAEMON_NAME,
Connected = net_kernel:connect(Daemon),
if Connected ->
{listener,Daemon} ! {hello,self()},
receive
Response -> Response
after 10000 -> timeout end;
not Connected -> could_not_connect
end.
stop_daemon() ->
Daemon = ?DAEMON_NAME,
Connected = net_kernel:connect(Daemon),
if Connected ->
flush(),
{listener,Daemon} ! {quit,self()},
receive
bye -> wait_vm_stop(Daemon,10)
after 10000 -> ?TIMEOUT_WAITING_QUIT
end;
not Connected -> ?COULD_NOT_CONNECT
end.
shell_do(Verb) ->
{A,Hdl} = file:open('daemon_client.log',[append]),
case A of
ok ->
info(Hdl,"~s",[Verb]);
error -> error
end,
Result = handle_verb(Hdl,Verb),
info(Hdl,"Return status ~.10B",[Result]),
init:stop(Result).
%%handle_verb(_,_) -> 0;
handle_verb(Hdl,["start"]) -> sd(Hdl);
handle_verb(_,["stop"]) -> stop_daemon();
handle_verb(Hdl,["restart"]) ->
stop_daemon(),
sd(Hdl);
handle_verb(Hdl,X) ->
info(Hdl,"handle_verb failed to match ~p",[X]),
?INVALID_VERB.
kill() ->
rpc:call(?DAEMON_NAME, init, stop, []).
start(Source) ->
Source ! starting,
start().
start() ->
register(listener,self()),
case {_,Hdl}=file:open("daemon_server.log",[append]) of
{ok,Hdl} -> server(Hdl);
{error,Hdl} -> {error,Hdl}
end.
info(Hdl,Fmt,D)->
io:fwrite(Hdl,"~w"++Fmt++"~n",[erlang:localtime()] ++ D).
server(Hdl) ->
info(Hdl,"~s",["waiting"]),
receive
{hello,Sender} ->
info(Hdl,"~s~w",["hello received from",Sender]),
Sender ! hello,
server(Hdl);
{getpid,Sender} ->
info(Hdl,"~s~w",["pid request from ",Sender]),
Sender ! self(),
server(Hdl);
{quit,Sender} ->
info(Hdl,"~s~w",["quit received from ",Sender]),
Sender ! bye,
init:stop();
_ ->
info(Hdl,"~s",["Unknown message received"]),
server(Hdl)
after
50000 ->
server(Hdl)
end.
For the reader not used to reading Erlang: some of this code is run as a result of the shell script we saw above, and the rest of the file is the daemon itself. Referring back to the shell script, we see that it calls the function shell_do. shell_do writes log entries, calls handle_verb and exits. handle_verb implements the different behaviours for each verb. Starting the daemon is handled by the function sd, which creates the daemon through the operating system call os:cmd, waits for the Erlang virtual machine to initialise, and then spawns the server code, start, which in turn calls server.
Sleep is available in erlang, through the timer functions.
http://www.erlang.org/doc/man/timer.html
For the background process, you can use the -detached cli argument.
You can specify an entry point with -s
EDIT
You can also spawn a new process from your main program:
http://www.erlang.org/doc/reference_manual/processes.html
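As an illustration of the spawn approach, here is a sketch of a process that runs a function at a fixed interval. The module name ticker and its functions are made up. It uses receive ... after rather than timer:sleep/1 so that the loop can also react to a stop message while it waits.

```erlang
-module(ticker).
-export([start/2, stop/1]).

%% Spawn a process that calls Fun every IntervalMs milliseconds.
start(IntervalMs, Fun) when is_function(Fun, 0) ->
    spawn(fun() -> loop(IntervalMs, Fun) end).

%% Ask the ticker process to finish.
stop(Pid) ->
    Pid ! stop,
    ok.

loop(IntervalMs, Fun) ->
    receive
        stop -> ok
    after IntervalMs ->
        Fun(),
        loop(IntervalMs, Fun)
    end.
```

For the heartbeat in the question, ticker:start(5000, Fun) would call Fun every 5 seconds, where Fun sends the UDP packet.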
With respect to daemonizing, consider starting your erlang program with the run_erl utility that comes with OTP. Note in particular the -daemon command line flag.

Resources