Erlang node connectivity problem - erlang

Struggling with connecting 2 nodes running on separate boxes. Tried to
make sure that there is no usual problems with cookie synchronization,
DNS or firewall.
First, I run epmd in debug mode as recommended by Erlang docs:
epmd -d -d
Then on box #1:
erl -name xmpp1#server1.net -kernel inet_dist_listen_min 6000 inet_dist_listen_max 6050 -setcookie testcookie
and on box #2:
erl -name xmpp2#server2.net -kernel inet_dist_listen_min 6000 inet_dist_listen_max 6050 -setcookie testcookie
No luck with ping. For example, on box #2:
Erlang (BEAM) emulator version 5.6.4 [source] [64-bit] [smp:4] [async-threads:0] [kernel-poll:false]
Eshell V5.6.4 (abort with ^G)
(xmpp2#server2.net)1> net_adm:ping('xmpp1#server1.net').
pang
epmd on server1.net shows following:
epmd: Sun Sep 12 01:40:32 2010: opening connection on file descriptor 6
epmd: Sun Sep 12 01:40:32 2010: got 8 bytes
***** 00000000 00 06 7a 78 6d 70 70 31 |..zxmpp1|
epmd: Sun Sep 12 01:40:32 2010: ** got PORT2_REQ
epmd: Sun Sep 12 01:40:32 2010: got 18 bytes
***** 00000000 77 00 17 70 4d 00 00 05 00 05 00 05 78 6d 70 70 |w..pM.......xmpp|
***** 00000010 31 00 |1.|
epmd: Sun Sep 12 01:40:32 2010: ** sent PORT2_RESP (ok) for "xmpp1"
epmd: Sun Sep 12 01:40:32 2010: closing connection on file descriptor 6
i.e., appears to receive ping request from second node and respond with ok.
Tshark listening on epmd port (TCP 4369) gives following (I replaced real IPs with server names):
1 0.000000 server2.net -> server1.net TCP 43809 > epmd [SYN] Seq=0 Win=5840 Len=0 MSS=1460 SACK_PERM=1 TSV=776213773 TSER=0 WS=5
2 0.000433 server1.net -> server2.net TCP epmd > 43809 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 SACK_PERM=1 TSV=1595930818 TSER=776213773 WS=6
3 0.000483 server2.net -> server1.net TCP 43809 > epmd [ACK] Seq=1 Ack=1 Win=5856 Len=0 TSV=776213773 TSER=1595930818
4 0.000545 server2.net -> server1.net EPMD 43809 > epmd [PSH, ACK] Seq=1 Ack=1 Win=5856 Len=8 TSV=776213773 TSER=1595930818
5 0.001445 server1.net -> server2.net TCP epmd > 43809 [ACK] Seq=1 Ack=9 Win=5824 Len=0 TSV=1595930818 TSER=776213773
6 0.001466 server1.net -> server2.net EPMD epmd > 43809 [PSH, ACK] Seq=1 Ack=9 Win=5824 Len=18 TSV=1595930818 TSER=776213773
7 0.001474 server2.net -> server1.net TCP 43809 > epmd [ACK] Seq=9 Ack=19 Win=5856 Len=0 TSV=776213773 TSER=1595930818
8 0.001481 server1.net -> server2.net TCP epmd > 43809 [FIN, ACK] Seq=19 Ack=9 Win=5824 Len=0 TSV=1595930818 TSER=776213773
9 0.001623 server2.net -> server1.net TCP 43809 > epmd [FIN, ACK] Seq=9 Ack=20 Win=5856 Len=0 TSV=776213773 TSER=1595930818
10 0.001990 server1.net -> server2.net TCP epmd > 43809 [ACK] Seq=20 Ack=10 Win=5824 Len=0 TSV=1595930818 TSER=776213773
So it looks to me that there is no firewall issues, as epmd instances talk to each other. What am I missing?
Your advise is very much appreciated!
Best regards,
Boris

I am also a newbie to erlang
My first few experiments were with binding absolute IP address.
Erl -name coder#192.168.1.2 -setcookie thusismadness
Erl -name dumb#192.168.1.3 -setcookie thusismadness
If you are connecting over internet make sure that you open ports specified in inet_dist_listen_min & inet_dist_listen_max in your router (app port) + epmd port.
Server1 -> router1 ports open for epmd & app port
Server2 -> router2 ports open for epmd & app port
Please bind over IP address first before using namespace.

Turns out to be a firewall issue. Big thanks to Michael Santos who showed me the right direction. Check out his answer here.

Related

displaying time in microsecond precision in tshark

It's possible in Wireshark (View -> Time Display Formats -> Microseconds), see attached image
Wireshark settings. Can somebody please share how to do this via tshark? I can see there are "-t" and "-u" options and they take care of some bits of date-time formatting but not second precision part.
I need this to automate some workflow. I get capture files having both micro and nano seconds, but filtered text output must be normalized.
If your capture file contains timestamps of such precision, -t a could print it, see my example:
% tshark -r packetcapture.cap -t a
1 10:23:40.232514 192.168.3.140 → 192.168.1.1 TLSv1.2 0 Application Data
2 10:23:40.232524 192.168.1.1 → 192.168.3.140 TCP 0 443 → 60152 [ACK] Seq=1 Ack=77 Win=514 Len=0 TSval=1643984880 TSecr=1054195834
3 10:23:40.232785 192.168.1.1 → 192.168.3.140 TLSv1.2 0 Application Data
4 10:23:40.235273 192.168.3.140 → 192.168.1.1 TCP 0 60152 → 443 [ACK] Seq=77 Ack=7095 Win=1937 Len=0 TSval=1054195837 TSecr=1643984880
5 10:23:40.235594 192.168.3.140 → 192.168.1.1 TCP 0 [TCP Window Update] 60152 → 443 [ACK] Seq=77 Ack=7095 Win=2048 Len=0 TSval=1054195837 TSecr=1643984880
Other formats described in tshark man page.

docker tcp retransmission in docker network

TCP rentransmission occurs on any docker network.
simple test: create VM in Azure, centos 7,7
yum update
yum install docker
systemctl start docker
docker run --name mynginx1 -P -d nginx
tshark -tad -i any -Y "tcp.analysis.retransmission"
curl localhost:32768
this results in occurence of TCP Retransmission.
[root#vm1 ~]# tshark -tad -i any -Y "tcp.analysis.retransmission"
Running as user "root" and group "root". This could be dangerous.
Capturing on 'any'
121 2020-04-24 07:26:55.504210673 109.81.211.189 -> 10.0.0.4 SSH 92 [TCP Retransmission] Encrypted request packet len=36
418 2020-04-24 07:27:04.982215355 109.81.211.189 -> 10.0.0.4 SSH 92 [TCP Retransmission] Encrypted request packet len=36
572 2020-04-24 07:27:10.746826933 172.17.0.1 -> 172.17.0.2 HTTP 147 [TCP Retransmission] GET / HTTP/1.1
576 2020-04-24 07:27:10.747858244 172.17.0.2 -> 172.17.0.1 TCP 307 [TCP Retransmission] http > 40514 [PSH, ACK] Seq=1 Ack=80 Win=29056 Len=239 TSval=1217913 TSecr=1217912
580 2020-04-24 07:27:10.747930345 172.17.0.2 -> 172.17.0.1 TCP 680 [TCP Retransmission] http > 40514 [PSH, ACK] Seq=240 Ack=80 Win=29056 Len=612 TSval=1217914 TSecr=1217913[Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]
the same problem on Kubernetes (tested with flannel plugin)
this issue significantly reduce the performance of container, in our case high performance message parser inside docker has 4x less results. Using docker host network resolves the issue. But using bridge network causes this issue.
Please any advice/help?
for reference - the reason of retransmission is seen from wireshark as outoforder/duplicate
enter image wireshark trace detail
Finally we identified the problem in used image based on Alpine. After we moved the docker base image into debian/ubuntu retransmission problem gone.

Docker in docker routing within Kubernetes

I've network related issue on the Kubernetes host, using Calico network layer. For continuous integration I need to run docker in docker, but running simple docker build with this Dockerfile:
FROM praqma/network-multitool AS build
RUN route
RUN ping -c 4 google.com
RUN traceroute google.com
produces output:
Step 1/4 : FROM praqma/network-multitool AS build
---> 3619cb81e582
Step 2/4 : RUN route
---> Running in 80bda13a9860
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 eth0
Removing intermediate container 80bda13a9860
---> d79e864eafaf
Step 3/4 : RUN ping -c 4 google.com
---> Running in 76354a92a413
PING google.com (216.58.201.110) 56(84) bytes of data.
--- google.com ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 53ms
---> 3619cb81e582
Step 4/4 : RUN traceroute google.com
---> Running in 3aa7908347ba
traceroute to google.com (216.58.201.110), 30 hops max, 46 byte packets
1 172.17.0.1 (172.17.0.1) 0.009 ms 0.005 ms 0.003 ms
Seems docker container has invalid routing while created off Kubernetes. Pods orchestrated by Kubernetes can access internet normally.
bash-5.0# ping -c 3 google.com
PING google.com (216.58.201.110) 56(84) bytes of data.
64 bytes from prg03s02-in-f14.1e100.net (216.58.201.110): icmp_seq=1 ttl=55 time=0.726 ms
64 bytes from prg03s02-in-f14.1e100.net (216.58.201.110): icmp_seq=2 ttl=55 time=0.586 ms
64 bytes from prg03s02-in-f14.1e100.net (216.58.201.110): icmp_seq=3 ttl=55 time=0.451 ms
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 10ms
rtt min/avg/max/mdev = 0.451/0.587/0.726/0.115 ms
bash-5.0# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
169.254.1.1 * 255.255.255.255 UH 0 0 0 eth0
bash-5.0# traceroute google.com
traceroute to google.com (216.58.201.110), 30 hops max, 46 byte packets
1 10-68-149-194.kubelet.kube-system.svc.kube.example.com (10.68.149.194) 0.006 ms 0.005 ms 0.004 ms

Port not accessible even after being exposed. Connection refused

we created a docker container like this:
docker container create \
--name orderer \
--network dscsa_net \
--workdir $WORK_DIR \
--expose=7050 \
hyperledger/fabric-orderer:1.3.0 ./start-orderer.sh
but are unable to connect to port 7050 on the container.
root#dcee7e74266f:/home# nc -vz 10.0.0.194 7050
nc: connect to 10.0.0.194 port 7050 (tcp) failed: Connection refused
we are able to ping the container:
root#dcee7e74266f:/home# ping 10.0.0.194
PING 10.0.0.194 (10.0.0.194) 56(84) bytes of data.
64 bytes from 10.0.0.194: icmp_seq=1 ttl=64 time=0.810 ms
64 bytes from 10.0.0.194: icmp_seq=2 ttl=64 time=1.30 ms
64 bytes from 10.0.0.194: icmp_seq=3 ttl=64 time=0.668 ms
64 bytes from 10.0.0.194: icmp_seq=4 ttl=64 time=1.10 ms
64 bytes from 10.0.0.194: icmp_seq=5 ttl=64 time=0.631 ms
^C
--- 10.0.0.194 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4006ms
rtt min/avg/max/mdev = 0.631/0.902/1.301/0.261 ms
and also see a process listening on port 7050 on the container:
root#9756199efefa:/home# netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 127.0.0.1:7050 0.0.0.0:* LISTEN 0 10097930 7/orderer
tcp 0 0 127.0.0.11:34865 0.0.0.0:* LISTEN 0 10097705 -
udp 0 0 127.0.0.11:51385 0.0.0.0:* 0 10097704 -
what is going on here? how can we fix this?
EDIT: we are on a overlay network. The publish flag suggested in the answer is n/a as we are doing container to container communication. Anyway we tried it and it doesn't work.
There is one thing we have noticed which is if we run:
docker network inspect <our-network-name>
Among other things, it prints out a containers section but in that section only the containers on the host from which docker network inspect is executed are listed. The containers hosted on other nodes are not listed (also mentioned here).
we verified that if we run:
docker node ls
all the nodes are part of the swarm.
It seems other people have also run into this issue e.g., here but what is the solution?
Note: we are able to connect to another container running a different service exposed on port 7054. This container was created without even using the expose flag.
root#dcee7e74266f:/home# nc -zv 10.0.0.164 7054
Connection to 10.0.0.164 7054 port [tcp/*] succeeded!
Did further debugging with tcpdump and output of tcpdump is identical to the output when someone tries to connect to a port on which no process is listening. But as shown earlier netstat shows a process that is listening and we can connect to the process from localhost.
Output of tcpdump:
root#dcee7e74266f:/test# tcpdump -s0 host 10.0.0.195
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:44:45.978583 IP dcee7e74266f.52148 > orderer.dscsa_net.7050: Flags [S], seq 3845506108, win 28200, options [mss 1410,sackOK,TS val 4203049443 ecr 0,nop,wscale 7], length 0
23:44:45.979324 IP orderer.dscsa_net.7050 > dcee7e74266f.52148: Flags [R.], seq 0, ack 3845506109, win 0, length 0
The R flag tells client to reset the connection.
Output of traceroute:
root#dcee7e74266f:/test# traceroute 10.0.0.195
traceroute to 10.0.0.195 (10.0.0.195), 30 hops max, 60 byte packets
1 orderer.dscsa_net (10.0.0.195) 1.008 ms 0.900 ms 0.872 ms
Expose only sets metadata on the image or container, it does not make the port externally accessible. The option you are looking for is publish:
docker container create \
--name orderer \
--network dscsa_net \
--workdir $WORK_DIR \
--publish=7050:7050 \
hyperledger/fabric-orderer:1.3.0 ./start-orderer.sh
Solved this issue thanks to 1. The server listening to 127.0.0.1 was the problem. Once we changed the listening address to 0.0.0.0 (shows as ::: in netstat output below), we are able to connect to the server:
root#e9766a94d102:/home# netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 127.0.0.11:37641 0.0.0.0:* LISTEN 0 12821468 -
tcp6 0 0 :::7050 :::* LISTEN 0 12821696 7/orderer
udp 0 0 127.0.0.11:51855 0.0.0.0:* 0 12821467 -
there is no need for either expose or publish flags. note to self: wasted 1.5 days on this.

Unable to ping docker host but the world is reachable

I am able to ping rest of the world but not the host from container in which docker container is running. I am sure someone encountered this issue before
See below details
Ubuntu 14.04.2 LTS (GNU/Linux 3.16.0-30-generic x86_64
**Container is using "bridge" network
Docker version 18.06.1-ce, build e68fc7a**
IP address for eth0: 135.25.87.162
IP address for eth1: 192.168.122.55
IP address for eth2: 135.21.171.209
IP address for docker0: 172.17.42.1
route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 135.21.248.1 0.0.0.0 UG 0 0 0 eth1
135.21.171.192 * 255.255.255.192 U 0 0 0 eth2
135.21.248.0 * 255.255.255.0 U 0 0 0 eth1
135.25.87.128 * 255.255.255.192 U 0 0 0 eth0
172.17.0.0 * 255.255.0.0 U 0 0 0 docker0
192.168.122.0 * 255.255.255.192 U 0 0 0 eth1
#ping commands from container
# ping google.com
PING google.com (64.233.177.113): 56 data bytes
64 bytes from 64.233.177.113: icmp_seq=0 ttl=29 time=51.827 ms
64 bytes from 64.233.177.113: icmp_seq=1 ttl=29 time=50.184 ms
64 bytes from 64.233.177.113: icmp_seq=2 ttl=29 time=50.991 ms
^C--- google.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 50.184/51.001/51.827/0.671 ms
# ping 135.25.87.162
PING 135.25.87.162 (135.25.87.162): 56 data bytes
^C--- 135.25.87.162 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
root#9ed17e4c2ee3:/opt/app/tomcat#

Resources