Trouble communicating with a ROS2 node in a Docker container

I'm studying ROS2. I've got a docker container with a ROS2 Foxy installation inside it.
This container has many other things installed, so I would prefer to keep using it rather than an image downloaded from Docker Hub.
The container is based on Ubuntu 18.04, and my host runs Ubuntu 20.04.
The following doesn't work:
On host: $ docker run --net host -it <container name>
Inside container:
# env | grep ROS_
ROS_DOMAIN_ID=142
ROS_VERSION=2
ROS_LOCALHOST_ONLY=0
ROS_PYTHON_VERSION=3
ROS_DISTRO=foxy
# ros2 run examples_rclpy_minimal_publisher publisher_local_function
[INFO] [1611658788.451254349] [minimal_publisher]: Publishing: "Hello World: 0"
[INFO] [1611658788.930325228] [minimal_publisher]: Publishing: "Hello World: 1"
[INFO] [1611658789.430629464] [minimal_publisher]: Publishing: "Hello World: 2"
...
On the same host in another terminal:
$ source /opt/ros/foxy/setup.zsh
$ export ROS_DOMAIN_ID=142
$ env | grep ROS_
ROS_DISTRO=foxy
ROS_LOCALHOST_ONLY=0
ROS_PYTHON_VERSION=3
ROS_VERSION=2
ROS_DOMAIN_ID=142
$ ros2 run examples_rclpy_minimal_subscriber subscriber_member_function
No output from subscriber.
At the same time, I see open UDP ports:
$ sudo netstat -unlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
udp 0 0 0.0.0.0:35379 0.0.0.0:* 2103557/python3
udp 0 0 127.0.0.1:41750 0.0.0.0:* 1867221/python3
udp 0 0 0.0.0.0:42900 0.0.0.0:* 2103557/python3
udp 0 0 0.0.0.0:42900 0.0.0.0:* 1867221/python3
udp 0 0 0.0.0.0:42912 0.0.0.0:* 2103557/python3
udp 0 0 0.0.0.0:42913 0.0.0.0:* 2103557/python3
udp 0 0 0.0.0.0:42916 0.0.0.0:* 1867221/python3
udp 0 0 0.0.0.0:42917 0.0.0.0:* 1867221/python3
udp 0 0 127.0.0.1:47375 0.0.0.0:* 2103557/python3
PIDs starting with 186xxxx belong to the ros2_daemon on the host; PIDs starting with 210xxxx belong to python running in the container.
If I run the subscriber in another /bin/bash inside the container, it works: the subscriber prints the messages it receives from the publisher.
Multicast UDP datagrams also work:
In container:
# ros2 multicast receive
Waiting for UDP multicast datagram...
Received from 106.xxx.xxx.xxx:45829: 'Hello World!'
On host:
$ ros2 multicast send
Sending one UDP multicast datagram...
UPDATE.
I've tried pulling the standard image osrf/ros:foxy-desktop, and with it the examples work as expected.
Publisher in container:
$ docker pull osrf/ros:foxy-desktop
$ docker run --net host -it osrf/ros:foxy-desktop
# export ROS_DOMAIN_ID=142
# env | grep ROS_
ROS_VERSION=2
ROS_PYTHON_VERSION=3
ROS_DOMAIN_ID=142
ROS_LOCALHOST_ONLY=0
ROS_DISTRO=foxy
# ros2 run examples_rclpy_minimal_publisher publisher_local_function
[INFO] [1611670054.887068490] [minimal_publisher]: Publishing: "Hello World: 0"
[INFO] [1611670055.367854925] [minimal_publisher]: Publishing: "Hello World: 1"
...
Subscriber on host:
$ ros2 run examples_rclpy_minimal_subscriber subscriber_member_function
[INFO] [1611670073.075589355] [minimal_subscriber]: I heard: "Hello World: 7"
[INFO] [1611670073.540520496] [minimal_subscriber]: I heard: "Hello World: 8"
[INFO] [1611670074.040020703] [minimal_subscriber]: I heard: "Hello World: 9"
...
Update 2:
Getting back to the original container: I see two UDP sockets with the same port number 7400 in netstat. Is that OK?
Update: Yes, it is: https://stackoverflow.com/a/1694148
The same phenomenon can be seen in the netstat output above, but with a different port number.
$ sudo netstat -unlp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
...
udp 0 0 0.0.0.0:39604 0.0.0.0:* 2319288/python3
udp 0 0 0.0.0.0:7400 0.0.0.0:* 2319288/python3
udp 0 0 0.0.0.0:7400 0.0.0.0:* 2319267/python3
udp 0 0 0.0.0.0:7412 0.0.0.0:* 2319267/python3
...
And processes:
$ ps axf
...
2319287 pts/4 S+ 0:00 \_ /usr/bin/python3 /opt/ros/foxy/bin/ros2 run examples_rclpy_minimal_publisher publisher_local_function
2319288 pts/4 Sl+ 0:01 \_ /usr/bin/python3 /opt/ros/foxy/lib/examples_rclpy_minimal_publisher/publisher_local_function
...
2319050 ? Sl 0:00 /usr/bin/containerd-shim-runc-v2 -namespace moby -id ae2da482416
2319075 pts/0 Ss+ 0:00 \_ /bin/bash
2319266 pts/0 S 0:00 \_ /usr/bin/python3 /root/git/ros2_foxy/install/bin/ros2 run examples_rclpy_minimal_subscriber subscriber_member_function
2319267 pts/0 Sl 0:00 \_ /usr/bin/python3 /root/git/ros2_foxy/install/lib/examples_rclpy_minimal_subscriber/subscriber_member_function
The process with PID 2319288 is running on the host; I accidentally cut off part of the ps output.
Update 3
If I run the docker container without --net=host, then the subscriber sees messages from the publisher. I can't use this setup, because the container is then not visible on the network.
I've replaced the subscriber in the container with netcat (netcat -l -u 42900), and netcat in the container received messages from the publisher running outside it. The container was run with --net=host.
This suggests that the network in the container is fine, but ROS2 somehow uses it incorrectly.
How do I correct it?

Recent releases of Fast-DDS enable the SharedMemory transport by default. With --net=host, both DDS participants believe they are on the same machine and try to communicate over SharedMemory instead of UDP. Since the container has its own /dev/shm, that attempt fails. The Fast-DDS team will work on a mechanism to detect this kind of situation. Meanwhile, I can give you two solutions:
Use an XML profile to disable the SharedMemory transport in one of the DDS participants:
<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles" >
    <transport_descriptors>
        <transport_descriptor>
            <transport_id>CustomUdpTransport</transport_id>
            <type>UDPv4</type>
        </transport_descriptor>
    </transport_descriptors>
    <participant profile_name="participant_profile" is_default_profile="true">
        <rtps>
            <userTransports>
                <transport_id>CustomUdpTransport</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
        </rtps>
    </participant>
</profiles>
Enable SharedMemory between host and container. For this you should share /dev/shm:
docker run -ti --net host -v /dev/shm:/dev/shm <DOCKER_IMAGE>
Also, both applications should run with the same UID. In my case, the docker container's user is root (UID=0), so I had to run the host application as root too.
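For the first option, the profile has to be handed to Fast-DDS when the node starts. A minimal sketch, assuming the Fast-DDS version shipped with Foxy reads the FASTRTPS_DEFAULT_PROFILES_FILE environment variable (other releases may use a different name); the file location is an arbitrary choice, not something from the answer:

```shell
# Hypothetical location for the profile; any readable path works.
PROFILE="${TMPDIR:-/tmp}/fastdds_udp_only.xml"

# Write the UDP-only profile from the answer above.
cat > "$PROFILE" <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles" >
    <transport_descriptors>
        <transport_descriptor>
            <transport_id>CustomUdpTransport</transport_id>
            <type>UDPv4</type>
        </transport_descriptor>
    </transport_descriptors>
    <participant profile_name="participant_profile" is_default_profile="true">
        <rtps>
            <userTransports>
                <transport_id>CustomUdpTransport</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
        </rtps>
    </participant>
</profiles>
EOF

# Assumption: this Fast-DDS version loads the default profile from this variable.
export FASTRTPS_DEFAULT_PROFILES_FILE="$PROFILE"

# Then start the node from the same shell, e.g.:
#   ros2 run examples_rclpy_minimal_subscriber subscriber_member_function
```

Only one side needs the profile: the participant that has it no longer uses SharedMemory, so discovery and data should fall back to UDP.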

Related

Docker | Bind for 0.0.0.0:80 failed | Port is already allocated

I've been trying every command I could find for several hours and could not fix this problem.
I tried everything covered in this article: Docker - Bind for 0.0.0.0:4000 failed: port is already allocated.
I currently have one container, shown by docker ps -a (meanwhile docker ps is empty):
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ebb9289dfd1 dockware/dev:latest "/bin/bash /entrypoi…" 2 minutes ago Created TheGoodPartDocker
When I try docker-compose up -d, I get the error:
ERROR: for TheGoodPartDocker Cannot start service shop: driver failed programming external connectivity on endpoint TheGoodPartDocker (3b59ebe9366bf1c4a848670c0812935def49656a88fa95be5c4a4be0d7d6f5e6): Bind for 0.0.0.0:80 failed: port is already allocated
I've tried to remove everything using: docker ps -aq | xargs docker stop | xargs docker rm
or to free the port: fuser -k 80/tcp
I even deleted the network state:
sudo service docker stop
sudo rm -f /var/lib/docker/network/files/local-kv.db
or just shutting everything down and removing the container manually:
docker-compose down
docker stop 5ebb9289dfd1
docker rm 5ebb9289dfd1
Here is also the output of netstat | grep 80:
unix 3 [ ] STREAM CONNECTED 20680 /mnt/wslg/PulseAudioRDPSink
unix 3 [ ] STREAM CONNECTED 18044
unix 3 [ ] STREAM CONNECTED 32780
unix 3 [ ] STREAM CONNECTED 17805 /run/guest-services/procd.sock
And docker port TheGoodPartDocker gives me no result.
I also restarted my computer, but nothing works :(.
Thanks for helping
Apparently port 80 is already occupied by some other process. You need to stop that process before you start the container. To find the process, use ss (the example below looks for port 22; filter for 80 in your case):
$ ss -tulpn | grep 22
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1187,fd=3))
tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=1187,fd=4))
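A plain grep on the port number also matches substrings (grep 80 would match port 8080, for example). A small sketch that filters by exact local port, assuming the column layout shown above, where the local address is the fifth field; the helper name is made up for illustration:

```shell
# listeners_on_port PORT
# Reads `ss -tulpn` output on stdin and prints only the rows whose
# local address (5th column) ends in ":PORT".
listeners_on_port() {
  awk -v port="$1" '$5 ~ (":" port "$")'
}

# Usage (root is needed to see processes owned by other users):
#   sudo ss -tulpn | listeners_on_port 80
```

If this prints nothing while Docker still reports the port as allocated, the port is probably not held by a process visible from this shell.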

Create a single container instead of 3 different containers

I have a docker-compose file that creates 3 different containers, but I want to combine those 3 containers into a single container/image instead of deploying them as multiple containers on the target system.
My current containers are as follows:
my main container, containing my code, which I built using a Dockerfile
the other 2 are Redis and Postgres containers, which I want to combine into 1.
Is there any way to do so?
First of all, running redis, postgres and your "main container" in one container is NOT best practice.
Typically you should have 3 separate containers (single app per container) communicating over the network. Sometimes we want to run two or more lightweight services inside the same container but redis and postgres aren't such services.
I recommend reading: best practices for building containers.
However, it's possible to have multiple services in the same docker container using the supervisord process management system.
I will run both the redis and postgres services in one docker container (similar to your case) to illustrate how it works. It's for demonstration purposes only.
This is the directory structure; we only need a Dockerfile and supervisor.conf (the supervisord config file):
$ tree example_container/
example_container/
├── Dockerfile
└── supervisor.conf
First, I created a supervisord configuration file with redis and postgres services defined:
$ cat example_container/supervisor.conf
[supervisord]
nodaemon=true
[program:redis]
command=redis-server # command to run redis service
autorestart=true
stderr_logfile=/dev/stdout
stderr_logfile_maxbytes = 0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes = 0
[program:postgres]
command=/usr/lib/postgresql/12/bin/postgres -D /var/lib/postgresql/12/main/ -c config_file=/etc/postgresql/12/main/postgresql.conf # command to run postgres service
autostart=true
autorestart=true
stderr_logfile=/dev/stdout
stderr_logfile_maxbytes = 0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes = 0
user=postgres
environment=HOME="/var/lib/postgresql",USER="postgres"
Next I created a simple Dockerfile:
$ cat example_container/Dockerfile
# Pinned to 20.04 (originally ubuntu:latest), the release that packages postgresql-12
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
# Installing redis and postgres
RUN apt-get update && apt-get install -y supervisor redis-server postgresql-12
# Copying supervisor configuration file to container
ADD supervisor.conf /etc/supervisor.conf
# Initializing redis and postgres services using supervisord
CMD ["supervisord","-c","/etc/supervisor.conf"]
And then I built the docker image:
$ docker build -t example_container:v1 .
Finally I ran and tested docker container using the image above:
$ docker run --name multi_services -dit example_container:v1
472c7b2eac7441360126f8fcd0cc80e0e63ac3039f8195715a3a400f6288a236
$ docker exec -it multi_services bash
root@472c7b2eac74:/# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.7 0.1 27828 23372 pts/0 Ss+ 10:04 0:00 /usr/bin/python3 /usr/bin/supervisord -c /etc/supervisor.conf
postgres 8 0.1 0.1 212968 28972 pts/0 S 10:04 0:00 /usr/lib/postgresql/12/bin/postgres -D /var/lib/postgresql/12/main/ -c config_file=/etc/postgresql/12/main/postgresql.conf
root 9 0.1 0.0 47224 6216 pts/0 Sl 10:04 0:00 redis-server *:6379
...
root@472c7b2eac74:/# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 9/redis-server *:6
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN 8/postgres
tcp6 0 0 :::6379 :::* LISTEN 9/redis-server *:6
As you can see, it is possible to have multiple services in a single container, but this is NOT a recommended approach and should be used ONLY for testing.
Regarding Kubernetes, you can group your containers in a single pod, as a deployment unit.
A Pod is the smallest deployable unit of computing that you can create and manage in Kubernetes.
It is a group of one or more containers, with shared storage and network resources, and a specification for how to run the containers.
A Pod's contents are always co-located and co-scheduled, and run in a shared context.
That would be more helpful than trying to merge containers together in one container.

Error starting userland proxy: listen tcp 0.0.0.0:7050: bind: address already in use

I'm setting up a Hyperledger Fabric private network on Linux and got this message while running ./byfn.sh up.
As I'm a newbie with Ubuntu and Docker, I think the port needs to be changed to fix the problem, but I have no clue how to do so. Any help would be appreciated.
alaa@ubuntu:~/fabric-samples/first-network$ sudo netstat -pna | grep 7050
tcp6 0 0 :::7050 :::* LISTEN 3682/docker-proxy
I ran netstat to check the port, and it is held by docker-proxy.
alaa@ubuntu:~/fabric-samples/first-network$ sudo ./byfn.sh up
Starting with channel 'mychannel' and CLI timeout of '10' seconds and CLI delay of '3' seconds
Continue? [Y/n] y
proceeding ...
2019-05-19 14:07:22.820 UTC [main] main -> INFO 001 Exiting.....
LOCAL_VERSION=1.1.0
DOCKER_IMAGE_VERSION=1.1.0
Creating network "net_byfn" with the default driver
Creating volume "net_orderer.example.com" with default driver
Creating volume "net_peer0.org1.example.com" with default driver
Creating volume "net_peer1.org1.example.com" with default driver
Creating volume "net_peer0.org2.example.com" with default driver
Creating volume "net_peer1.org2.example.com" with default driver
Creating orderer.example.com ... error
Creating peer1.org2.example.com ...
Creating peer1.org1.example.com ...
Creating peer0.org1.example.com ...
Creating peer1.org2.example.com ... done
Creating peer1.org1.example.com ... done
Creating peer0.org1.example.com ... done
Creating peer0.org2.example.com ... done
ERROR: for orderer.example.com Cannot start service orderer.example.com: b'driver failed programming external connectivity on endpoint orderer.example.com (60d170dbc933d3c2de9eacd1bb6c7842cf79a52b3a938c9e0e69d1bd55f5e1a9): Error starting userland proxy: listen tcp 0.0.0.0:7050: bind: address already in use'
ERROR: Encountered errors while bringing up the project.
ERROR !!!! Unable to start network
alaa@ubuntu:~/fabric-samples/first-network$ sudo netstat -pna | grep 7050
tcp6 0 0 :::7050 :::* LISTEN 3682/docker-proxy
Well, first of all, for any kind of Hyperledger tutorial you had better follow the official documentation, because most other sources are derived from it: https://hyperledger-fabric.readthedocs.io/en/release-1.4/
Second, bring down the network, stop and remove all running and previous containers, restart Docker, and re-run the network properly; it should then work fine:
$ ./byfn.sh down
$ docker ps -qa | xargs docker rm
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ cd ..../fabric-samples/first-network
$ ./byfn.sh -m generate
$ ./byfn.sh -m up
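The netstat output in the question shows docker-proxy holding port 7050, which suggests that a container (possibly left over from an earlier run) is still publishing it. A sketch for finding that container before removing everything, assuming the installed Docker supports the publish filter of docker ps; the helper name is made up for illustration:

```shell
# containers_publishing PORT
# Lists running containers that publish PORT on the host.
containers_publishing() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found in PATH" >&2
    return 127
  fi
  docker ps --filter "publish=$1" --format '{{.ID}}  {{.Names}}  {{.Ports}}'
}

# Usage:
#   containers_publishing 7050
```

Stopping and removing only the container it reports may be enough to free the port without restarting the Docker daemon.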

How to add a docker health check to test a tcp port is open?

I have a server running an Apache Hive service on port 9083. It doesn't support the HTTP protocol (it uses the Thrift protocol), so I can't simply add
HEALTHCHECK CMD http://127.0.0.1:9083 || exit 1 # this check fails
All I want is to check if that port is open. I have netstat and curl on server but not nc.
So far I tried the below options, but none of them is suitable as a health check.
netstat -an | grep 9083 | grep LISTEN | echo $? # This works
netstat -an | grep 9084 | grep LISTEN | echo $? # but so does this
As I interpret it, the output above only tells me that my syntax is correct; it does not really test whether the port is listening,
because netstat -an gives the following output, which clearly shows that only 9083 is listening and not 9084:
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:9083 0.0.0.0:* LISTEN
tcp 0 0
Though answering an old question, for future googlers, the following worked for me:
HEALTHCHECK CMD netstat -an | grep 9083 > /dev/null; if [ 0 != $? ]; then exit 1; fi;
The shorter version of it:
netstat -ltn | grep -c 9083
Used options:
netstat:
-l - display listening server sockets
-t - display TCP sockets only
-n - don't resolve names
grep
-c - prints the number of matching lines, and also gives a useful exit code: 0 if a match is found, 1 if not
You can use bash's /dev/tcp pseudo-device, like this:
printf "GET / HTTP/1.1\n\n" > /dev/tcp/127.0.0.1/9083
For more information, you can check this : http://www.tldp.org/LDP/abs/html/devref1.html#DEVTCP
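The /dev/tcp redirection exits with a non-zero status when the connection cannot be opened, which is what makes it usable as a health check. A sketch that wraps it in a function with a time limit, assuming bash (built with network redirections) and the coreutils timeout command are available in the image; the function name is made up for illustration:

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 2 seconds.
# /dev/tcp is a bash feature, so the check runs in an explicit bash process;
# HOST and PORT are passed to it as $0 and $1.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage, e.g. as the command of a health check:
#   port_open 127.0.0.1 9083 || exit 1
```

Unlike the printf variant, this only opens and closes the connection and sends nothing to the service.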
Piotr Perzyna's answer is quite good, but it will also return 0 if there's a port like 19083, because it finds the substring 9083 in that line.
A better check would be:
netstat -ltn | grep -c ":9083"
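Even ":9083" still matches a longer port number such as 90830. A sketch of a stricter match that requires whitespace right after the port number, assuming netstat -ltn output on stdin; the helper name is made up for illustration:

```shell
# has_listener PORT
# Reads `netstat -ltn` output on stdin; succeeds only if some line
# contains ":PORT" immediately followed by whitespace.
has_listener() {
  grep -Eq ":$1[[:space:]]"
}

# Usage:
#   netstat -ltn | has_listener 9083 || exit 1
```

Because grep's own exit status is the result, this also avoids the `| echo $?` pattern from the question, where `$?` is expanded before the pipeline runs and so does not reflect grep's result.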
I really loved Wassim Dhif's answer, mostly because it does not depend on netstat.
Netstat was obsolete before the question was asked.
From netstat's manpage:
Note
This program is obsolete. Replacement for netstat is ss. Replacement for netstat -r is ip route. Replacement for netstat -i is ip -s link. Replacement for netstat -g is ip maddr.
And you need netstat (or ss) installed in the container. You only want to check if a port is open. Wassim Dhif's answer just needs bash (and yes, not every image has it). In my experience, you usually want the image as light as possible.
In my compose I used it as follows:
healthcheck:
test: "bash -c 'printf \"GET / HTTP/1.1\n\n\" > /dev/tcp/127.0.0.1/8080; exit $?;'"
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
Note that the string form of test is equivalent to specifying CMD-SHELL followed by that string (from the Compose Specification).

Docker remote API doesn't start after my computer restarts

Last week I struggled to get my Docker remote API working. As it is running on a VM, I had not restarted the VM since then. Today I finally restarted it and the remote API is not working any more (docker and docker-compose work normally). My docker init file, /etc/init/docker.conf, looks like this:
description "Docker daemon"
start on filesystem and started lxc-net
stop on runlevel [!2345]
respawn
script
/usr/bin/docker -H tcp://0.0.0.0:4243 -d
end script
# description "Docker daemon"
# start on (filesystem and net-device-up IFACE!=lo)
# stop on runlevel [!2345]
# limit nofile 524288 1048576
# limit nproc 524288 1048576
respawn
kill timeout 20
.....
.....
Last time I applied the settings described here.
I tried nmap to see whether port 4243 is open.
ubuntu@ubuntu:~$ nmap 0.0.0.0 -p-
Starting Nmap 7.01 ( https://nmap.org ) at 2016-10-12 23:49 CEST
Nmap scan report for 0.0.0.0
Host is up (0.000046s latency).
Not shown: 65531 closed ports
PORT STATE SERVICE
22/tcp open ssh
43978/tcp open unknown
44672/tcp open unknown
60366/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 1.11 seconds
As you can see, port 4243 is not open.
When I run:
ubuntu@ubuntu:~$ echo -e "GET /images/json HTTP/1.0\r\n" | nc -U
This is nc from the netcat-openbsd package. An alternative nc is available
in the netcat-traditional package.
usage: nc [-46bCDdhjklnrStUuvZz] [-I length] [-i interval] [-O length]
[-P proxy_username] [-p source_port] [-q seconds] [-s source]
[-T toskeyword] [-V rtable] [-w timeout] [-X proxy_protocol]
[-x proxy_address[:port]] [destination] [port]
I also ran this:
ubuntu@ubuntu:~$ sudo docker -H=tcp://0.0.0.0:4243 -d
flag provided but not defined: -d
See 'docker --help'.
I restarted my computer many times and tried a lot of things with no success.
I already have a group named docker, and my user is in it:
ubuntu@ubuntu:~$ groups $USER
ubuntu : ubuntu adm cdrom sudo dip plugdev lpadmin sambashare docker
Please tell me what is wrong.
Your startup script contains an invalid command:
/usr/bin/docker -H tcp://0.0.0.0:4243 -d
Instead you need something like:
/usr/bin/docker daemon -H tcp://0.0.0.0:4243
As of 1.12, this is now (but docker daemon will still work):
/usr/bin/dockerd -H tcp://0.0.0.0:4243
Please note that this is opening a port that gives remote root access without any password to your docker host.
Anyone that wants to take over your machine can run docker run -v /:/target -H your.ip:4243 busybox /bin/sh to get a root shell with your filesystem mounted at /target. If you'd like to secure your host, follow this guide to setting up TLS certificates.
I finally found www.ivankrizsan.se and it is working fine now. Thanks to this guy (or girl) ;).
These settings work for me on Ubuntu 16.04. Here is how to do it:
Edit this file /lib/systemd/system/docker.service and replace the line ExecStart=/usr/bin/dockerd -H fd:// with
ExecStart=/usr/bin/docker daemon -H fd:// -H tcp://0.0.0.0:4243
Save the file
Reload systemd so it picks up the edited unit file, then restart Docker: sudo systemctl daemon-reload && sudo service docker restart
Test with: curl http://localhost:4243/version
Result: you should see something like this:
{"Version":"1.11.0","ApiVersion":"1.23","GitCommit":"4dc5990","GoVersion":"go1.5.4","Os":"linux","Arch":"amd64","KernelVersion":"4.4.0-22-generic","BuildTime":"2016-04-13T18:38:59.968579007+00:00"}
Attention:
Be aware that binding to 0.0.0.0 is bad for security; to be safer, bind to 127.0.0.1 instead.

Resources