I am new to Erlang and RabbitMQ.
I have a RabbitMQ node on CentOS which I had to reset to restart the message queues. Ever since the restart, Erlang refuses to start the node. There was an erlang_vm corrupted error that was fixed with a rabbit remove and restart. I've tried net_kernel:start in the Erlang shell, but it fails.
[root@directadmin ~]# erl
Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.4 (abort with ^G)
1> node().
nonode@nohost
2> net_kernel:start([rabbit, shortnames]).
{error,
{{shutdown,
{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}},
{child,undefined,net_sup_dynamic,
{erl_distribution,start_link,[[rabbit,shortnames]]},
permanent,1000,supervisor,
[erl_distribution]}}}
3>
=INFO REPORT==== 26-Jan-2017::18:58:36 ===
Protocol: "inet_tcp": the name rabbit@directadmin seems to be in use by another Erlang node
I've noticed that someone else had a similar issue, and they said that fixing the rule set in iptables resolved it. I am not sure how that is done. I've tried service iptables restart, but that didn't make any difference.
http://erlang.org/pipermail/erlang-questions/2015-October/086270.html
When I try to run rabbitmqctl stop_app I get this error:
[root@directadmin ~]# rabbitmqctl stop_app
Stopping node rabbit@directadmin ...
Error: erlang_vm_restart_needed
When I try running 'rabbitmqctl stop' I get the VM corrupted error:
[root@directadmin ~]# rabbitmqctl stop
Stopping and halting node rabbit@directadmin ...
Error: {badarg,[{io,format,
[standard_error,
"Erlang VM I/O system is damaged, restart needed~n",[]],
[]},
{rabbit_log,handle_damaged_io_system,0,
[{file,"src/rabbit_log.erl"},{line,110}]},
{rabbit_log,with_local_io,1,
[{file,"src/rabbit_log.erl"},{line,95}]},
{rabbit,'-stop_and_halt/0-after$^0/0-0-',0,
[{file,"src/rabbit.erl"},{line,434}]},
{rabbit,stop_and_halt,0,[{file,"src/rabbit.erl"},{line,431}]},
{rpc,'-handle_call_call/6-fun-0-',5,
[{file,"rpc.erl"},{line,187}]}]}
The disk was full, possibly because of the errors being written to log files. I deleted the logs that occupied the most space in /var/log and then ran yum erase erlang, followed by a clean reinstall of Erlang and RabbitMQ. This resolved the issue. Thank you everyone for your contribution!
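Since a full disk turned out to be the root cause here, a quick free-space check before restarting the broker can save some time. The following is a minimal Python sketch; the function names and the 5% threshold are illustrative assumptions, not anything provided by RabbitMQ or Erlang.

```python
import shutil

def free_fraction(path: str = "/") -> float:
    """Return the fraction of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    if usage.total == 0:              # pseudo-filesystems can report zero size
        return 0.0
    return usage.free / usage.total

def is_nearly_full(path: str = "/", threshold: float = 0.05) -> bool:
    """True when less than `threshold` (assumed 5% by default) is free."""
    return free_fraction(path) < threshold

if __name__ == "__main__":
    # Hypothetical usage: check the filesystem that holds the log directory.
    print("free fraction:", round(free_fraction("/"), 3))
    print("nearly full:", is_nearly_full("/"))
```

Pointing `path` at the log directory (for example /var/log) checks the filesystem that the logs actually live on, which may differ from the root filesystem.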
You need rabbitmqctl stop, not just rabbitmqctl stop_app.
According to the documentation, stop_app "stops the RabbitMQ application, leaving the Erlang node running", while stop "stops the Erlang node on which RabbitMQ is running".
The issue comes from the fact that epmd is not started.
You need to start epmd manually, or have it started for you by providing a node name when launching erl. This is not specific to the RabbitMQ distribution.
http://erlang.org/documentation/doc-8.0/erts-8.0/doc/html/epmd.html
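To check from outside the Erlang shell whether epmd is running and which node names it has registered, one option is to speak epmd's small TCP protocol directly. The Python sketch below assumes the default port 4369 and the NAMES_REQ request (code 110) described in the Erlang distribution protocol documentation. It also assumes epmd closes the connection after replying. The helper names are made up for illustration. It is roughly what epmd -names reports.

```python
import socket
import struct

EPMD_PORT = 4369   # default epmd port (assumed; can be overridden on the epmd side)
NAMES_REQ = 110    # request code for "list registered names"

def parse_names_reply(payload: bytes) -> dict:
    """Parse a NAMES reply: a 4-byte epmd port number, then text lines."""
    text = payload[4:].decode("latin-1")
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        # Each line is expected to look like: "name nitrogen at port 61109"
        if len(parts) == 5 and parts[0] == "name" and parts[2:4] == ["at", "port"]:
            nodes[parts[1]] = int(parts[4])
    return nodes

def epmd_names(host: str = "127.0.0.1", port: int = EPMD_PORT, timeout: float = 2.0):
    """Return {node_name: port}, or None if epmd cannot be reached."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Requests are a 2-byte big-endian length followed by the body.
            sock.sendall(struct.pack(">HB", 1, NAMES_REQ))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:          # epmd is assumed to close after replying
                    break
                chunks.append(data)
    except OSError:
        return None
    return parse_names_reply(b"".join(chunks))

if __name__ == "__main__":
    names = epmd_names()
    print("epmd not reachable" if names is None else names)
```

A result of None suggests epmd is not running (or is firewalled). A dictionary that already contains the short name you want to use, such as rabbit, matches the "seems to be in use by another Erlang node" message.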
This is embarrassing, but I am totally stuck and have wasted the better part of this morning. I have an Erlang app release created by relx, deployed and running in a Docker container. I need to get to the shell on the running node, but I'm failing to do so. Here is what happens:
$ docker exec -it 770b497d7f27 /bin/bash
[root@ff /]# /app/bin/ff
Usage: ff {start|start_boot <file>|foreground|stop|restart|reboot|pid|ping|console|console_clean|console_boot <file>|attach|remote_console|upgrade|escript|rpc|rpcterms}
[root@ff /]# /app/bin/ff ping
pong
[root@ff /]# /app/bin/ff attach
Can't access pipe directory /tmp/erl_pipes/ff@127.0.0.1/: No such file or directory
[root@ff /]# /app/bin/ff remote_console
Eshell V7.1 (abort with ^G)
(remshfbfbd4dd-ff@127.0.0.1)1> ^G
Eshell V7.1 (abort with ^G)
(remshfbfbd4dd-ff@127.0.0.1)1>
And that's it - I can exit with q().
There is no erl_pipes in /tmp.
Control-G seems to be captured by Docker. I cannot get to the "User switch command" menu.
Even running a pure Erlang shell is not so easy:
[root@ff /]# /app/erts-7.1/bin/erl
{"init terminating in do_boot",{'cannot get bootfile','/app/bin/start.boot'}}
Crash dump is being written to: erl_crash.dump...done
init terminating in do_boot ()
I have run out of ideas. Any help would be appreciated.
Found a workaround, managed to get ^G working by overriding the default "dumb" terminal docker sets:
export TERM=xterm
After this ^G works, starting a remote shell works, and I'm a happy camper! Would be glad to know why neither the attach nor the remote_console commands work though.
In the shell I typed bin/dev page foo and it returned "Node is not running". I checked my logs and noticed the message: epmd: epmd: node name already occupied nitrogen
Then, in shell I typed epmd -names and it returned
epmd: up and running on port 4369 with data:
name nitrogen at port 61109
Running epmd -debug gives
epmd: Thu Jun 27 01:01:52 2013: epmd running - daemon = 0
epmd: Thu Jun 27 01:01:52 2013: there is already a epmd running at port 4369
I cannot stop the node, and when I try, it is apparently still active in the db:
epmd: local epmd responded with <>
Killing not allowed - living nodes in database.
In Eshell, I received the following
=ERROR REPORT==== 27-Jun-2013::00:49:53 ===
** Connection attempt from disallowed node 'nitrogen_maint_19141@127.0.0.1' **
Is there a way to get Eshell to recognize this node, so that I can run the bin/dev function?
I've noticed you posting on the Nitrogen mailing list, and as I understand it, you've got it straightened out, but in this situation, I'd kill the running node manually with a ps aux | grep nitrogen, then kill the process it finds with a simple kill XYZ.
That, or, I've seen the "Node is not running" thing pop up when the process was launched with a different user, such that you don't have access to the erlang pipe.
Admittedly, my advice isn't terribly scientific (killing a process is pretty nasty), but it's a simple solution if for whatever reason something got hosed during launching and you're unable to attach to the node.
I ran netstat -lputn to find out which program is listening on port 8080, but got the output below:
As you can see, no PID or program name is shown. Why?
I found it:
ps -ef|grep 8080
It turns out to be Jenkins.
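A common reason for netstat showing no PID or program name is that it was run without root privileges, in which case sockets owned by other users cannot be matched to a process. As a rough illustration of how that matching works on Linux, the Python sketch below finds the listening socket's inode in the text of /proc/net/tcp (or /proc/net/tcp6) and then looks for that inode among the /proc/<pid>/fd entries. The function names are hypothetical, and the field positions are assumed from the usual /proc/net/tcp layout.

```python
import os

def listening_inodes(proc_net_tcp: str, port: int) -> set:
    """Return inodes of sockets in LISTEN state on `port`, given /proc/net/tcp text."""
    inodes = set()
    for line in proc_net_tcp.splitlines()[1:]:        # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)  # port is hex after the colon
        if local_port == port and fields[3] == "0A":       # state 0A means LISTEN
            inodes.add(fields[9])
    return inodes

def pids_for_inodes(inodes: set) -> set:
    """Scan /proc/<pid>/fd for the given socket inodes."""
    wanted = {"socket:[%s]" % inode for inode in inodes}
    pids = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = "/proc/%s/fd" % entry
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(os.path.join(fd_dir, fd)) in wanted:
                    pids.add(int(entry))
        except OSError:
            # Permission denied or the process exited: without root, other
            # users' processes are skipped, so no PID can be reported.
            continue
    return pids

if __name__ == "__main__":
    inodes = set()
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(table) as handle:
                inodes |= listening_inodes(handle.read(), 8080)
        except OSError:
            pass
    print("inodes:", inodes, "pids:", pids_for_inodes(inodes))
```

Run as an unprivileged user, this may find the inode but no PID for a service owned by another account, which mirrors the empty column in netstat.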
I am trying to set up a tsung cluster on two EC2 instances:
Master - ip-10-212-101-85.ec2.internal
Slave - ip-10-116-39-86.ec2.internal
Both have Erlang (R15B) and tsung (1.4.2) installed, and the install path is the same on both of them.
I can ssh from Master to Slave and vice versa without a password.
The firewall is stopped on both machines (service iptables stop).
On Master, the attempt to start an Erlang slave agent results in {error,timeout}:
[root@ip-10-212-101-85 ~]# erl -rsh ssh -sname foo -setcookie mycookie
Erlang R15B (erts-5.9) [source] [64-bit] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9 (abort with ^G)
(foo@ip-10-212-101-85)1> slave:start('ip-10-116-39-86',bar,"-setcookie mycookie").
{error,timeout}
On Slave, beam comes up for a few seconds and then crashes. The erl_crash.dump can be found here
I am stuck with this error; any clue would be very helpful.
PS:
On both machines /etc/hosts is the same; the file looks like this:
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
10.212.101.85 ip-10-212-101-85.ec2.internal
10.116.39.86 ip-10-116-39-86.ec2.internal
Looks like "service iptables stop" on individual nodes is not sufficient.
In the Security Group that is applied to the VMs, I added a new rule that opens port range 0 - 65535 for all.
This solved the problem.
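Before opening the whole port range, it can help to confirm from the master which ports on the slave are actually reachable. Erlang distribution needs epmd's port (4369 by default) plus the port the remote node itself listens on, and a security group can block either one even when iptables is stopped. The Python sketch below is only a plain TCP reachability probe; the function name and timeout are illustrative assumptions.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (typical for a filtering
        # firewall or security group), and name resolution failures.
        return False

if __name__ == "__main__":
    # Hypothetical usage with the slave host from the question.
    host = "ip-10-116-39-86.ec2.internal"
    for port in (22, 4369):
        print(host, port, "reachable" if can_connect(host, port) else "blocked or closed")
```

If port 22 is reachable but 4369 is not while a node is running on the slave, the security group is a likely suspect. A narrower rule than 0 - 65535 may then be enough, though the exact range depends on how the nodes are configured.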
If that's all verbatim, then the problem is likely slave:start('ip-10-116-39-86',bar,"-sttcookie mycookie"). - Try slave:start('ip-10-116-39-86',bar,"-setcookie mycookie"). instead.