Unable to start Flink JobManager container in Docker - docker

I am unable to start a Flink JobManager Docker container on an M1 MacBook running macOS Monterey. Below are the docker command, taken from the Flink docs, and the resulting java.io.IOException:
docker run \
--rm \
--name=jobmanager \
--network flink-network \
--publish 8081:8081 \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink:1.16.0-scala_2.12 jobmanager
INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneSessionClusterEntrypoint down with application status FAILED. Diagnostics java.io.IOException: Could not create the working directory /tmp/jm_ba47b82cf8d85068faa1c41d30126b5d.
at org.apache.flink.runtime.entrypoint.WorkingDirectory.createDirectory(WorkingDirectory.java:58)
at org.apache.flink.runtime.entrypoint.WorkingDirectory.<init>(WorkingDirectory.java:39)
at org.apache.flink.runtime.entrypoint.WorkingDirectory.create(WorkingDirectory.java:88)
at org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils.lambda$createJobManagerWorkingDirectory$2(ClusterEntrypointUtils.java:241)
at org.apache.flink.runtime.entrypoint.DeterminismEnvelope.map(DeterminismEnvelope.java:49)
at org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils.createJobManagerWorkingDirectory(ClusterEntrypointUtils.java:239)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:356)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:282)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:232)
at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:229)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:729)
at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:59)
My docker version is:
Client:
Cloud integration: v1.0.29
Version: 20.10.22
API version: 1.41
Go version: go1.18.9
Git commit: 3a2c30b
Built: Thu Dec 15 22:28:41 2022
OS/Arch: darwin/arm64
Context: default
Experimental: true
Why does the container not allow creating the required directory?

It looks like this was caused by a lack of disk space. Running docker system prune freed some space, and the container now starts and runs normally.
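A full disk can be ruled in or out before starting the container. The sketch below is only an illustration, not part of the original answer: it assumes a Python interpreter is available in the environment whose disk is suspected to be full, and the helper name has_free_space and the 512 MiB threshold are made up for the example. On the Docker side, docker system df reports how much space images, containers, and volumes occupy.

```python
import shutil
import tempfile


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` bytes free (hypothetical helper for illustration)."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # The temp directory stands in for the /tmp path from the stack trace.
    tmp_dir = tempfile.gettempdir()
    usage = shutil.disk_usage(tmp_dir)
    print(f"{tmp_dir}: {usage.free / 2**30:.1f} GiB free "
          f"of {usage.total / 2**30:.1f} GiB")
    # 512 MiB is an arbitrary threshold chosen for this example.
    print("enough space:", has_free_space(tmp_dir, 512 * 2**20))
```

If the check reports too little space, pruning unused Docker data as described above is one way to free some.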

Related

Why can't I lookup other container by DNS in container

According to the official Docker documentation, Docker starts a DNS server that lets a container look up other containers directly by container ID or name.
containers that use a custom network use Docker’s embedded DNS server, which forwards external DNS lookups to the DNS servers configured on the host.
But when I try to use nslookup inside a container, the lookup fails, while wget still succeeds. What makes them behave differently?
Reproduce steps:
docker network create my-net
docker run -d --name web --network my-net httpd
docker run -it --rm --network my-net busybox
Then, inside the busybox container:
$ wget -q -O - web
<html>...some content...</html>
That works, but nslookup fails:
$ nslookup web
Server: 127.0.0.11
Address: 127.0.0.11:53
Non-authoritative answer:
*** Can't find web: No answer
This is my Docker version:
$ docker version
Client: Docker Engine - Community
Version: 20.10.21
API version: 1.41
Go version: go1.19.2
Git commit: baeda1f82a
Built: Tue Oct 25 17:53:02 2022
OS/Arch: darwin/amd64
Context: colima
Experimental: true
Server:
Engine:
Version: 20.10.18
API version: 1.41 (minimum version 1.12)
Go version: go1.18.6
Git commit: e42327a6d3c55ceda3bd5475be7aae6036d02db3
Built: Sun Sep 11 07:10:00 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.6.8
GitCommit: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc:
Version: 1.1.4
GitCommit: 5fd4c4d144137e991c4acebb2146ab1483a97925
docker-init:
Version: 0.19.0
GitCommit:
While reproducing your issue, I noticed that nslookup failed for every query (e.g., nslookup google.com also failed). I then started an ubuntu container on the same network, and there both wget and nslookup worked fine. I do not know the exact reason, but my guess is that wget and nslookup rely on system functionality that differs between busybox and ubuntu.
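One way to probe this guess is to resolve the name through the system resolver (getaddrinfo), which is the lookup path that ordinary network clients typically use, and compare the result with what nslookup prints. The following is a minimal sketch, assuming Python is available in a container attached to the same network; resolve is a hypothetical helper name, and web is the container name from the question.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Resolve `hostname` through the system resolver (getaddrinfo) and
    return the sorted unique addresses, or an empty list on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("web") inside such a container should return the address of the web container if the system resolver can reach Docker's embedded DNS server. An address here combined with a failing nslookup would point at the nslookup implementation rather than at the network setup.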

Docker swarm service environment variable is not visible within container

I am creating a docker service with an environment variable:
docker service create --env TEST=123 myservice
And I verify the environment variable was set with
$ docker service inspect myservice
...
ContainerSpec:
Env: TEST=123
...
But then the environment variable does not show up within the docker container. In particular, the code running inside the docker container prints os.Environ(), and in the logs I see only the standard environment variables:
$ docker service logs myservice
[HOSTNAME=48bcddab9204 SHLVL=1 HOME=/root PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/app]
So the environment variable "TEST" is not set. How can I set up a service that defines this environment variable?
The output of docker version is
Client:
Version: 20.10.12
API version: 1.41
Go version: go1.17.5
Git commit: e91ed5707e
Built: Mon Dec 13 22:31:40 2021
OS/Arch: linux/amd64
Context: synology
Experimental: true
Server:
Engine:
Version: 20.10.3
API version: 1.41 (minimum version 1.12)
Go version: go1.15.13
Git commit: a3bc36f
Built: Thu Aug 19 07:11:25 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.4.3
GitCommit: ea3508454ff2268c32720eb4d2fc9816d6f75f88
runc:
Version: v1.0.0-rc93
GitCommit: 31cc25f16f5eba4d0f53e35374532873744f4b31
docker-init:
Version: 0.19.0
GitCommit: ed96d00
Variables should be defined in the environment when you use --env. If that's not happening, we'd need a complete example to reproduce. Here's an example showing that it works:
$ docker service create --name env-test --env TESTVAR=123 busybox tail -f /dev/null
nj9l6z57d9pviztyp9pglmv4r
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
$ docker ps -l
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
697c1bf221b1 busybox:latest "tail -f /dev/null" 20 seconds ago Up 19 seconds env-test.1.vlrepc8mqqvx7gysh2qsymja8
$ docker exec -it env-test.1.vlrepc8mqqvx7gysh2qsymja8 env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=697c1bf221b1
TERM=xterm
TESTVAR=123
HOME=/root
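To narrow down where a variable gets lost, the service specification can be compared with what a process actually sees. The sketch below is an illustration under stated assumptions: it reads the JSON that docker service inspect prints by default (without --pretty), where the variables are listed under Spec.TaskTemplate.ContainerSpec.Env, and the helper names spec_env and missing_vars are hypothetical.

```python
import json
import os
from typing import Dict, List, Mapping, Optional


def spec_env(inspect_json: str) -> Dict[str, str]:
    """Extract the Env entries of the first service in the JSON output
    of `docker service inspect`."""
    services = json.loads(inspect_json)
    container_spec = services[0]["Spec"]["TaskTemplate"]["ContainerSpec"]
    env: Dict[str, str] = {}
    for item in container_spec.get("Env") or []:
        # Entries have the form "KEY=VALUE"; split on the first "=" only.
        key, _, value = item.partition("=")
        env[key] = value
    return env


def missing_vars(expected: Mapping[str, str],
                 actual: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names whose expected value is absent or different in
    `actual` (defaults to the environment of the current process)."""
    actual = os.environ if actual is None else actual
    return sorted(k for k, v in expected.items() if actual.get(k) != v)
```

Running missing_vars(spec_env(...)) inside the task's container lists the variables from the spec that the process does not see. An empty list means the environment matches the spec, so the problem would lie elsewhere, for example in how the program prints its environment.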

How to communicate between Docker container and Linux host over SCTP

I tried to listen for SCTP connections in a Docker container and connect to it from the Linux host machine, but the connection timed out.
Is there any way to communicate between the host and a container over SCTP?
FYI: container-to-container SCTP communication seems to work fine.
Details follow:
Dockerfile for test-container
FROM ubuntu:focal
RUN apt update -y && apt install -y ncat
docker run
$ sudo docker run --rm --name sctp-server -p 9999:9999/sctp test-container:latest ncat --sctp -lv 9999
SCTP request (timeout)
$ ncat --sctp 127.0.0.1 9999
Ncat: TIMEOUT.
docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:52 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:20 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.7
GitCommit: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
FYI: Container-to-Container SCTP communication (works fine)
$ sudo docker network create -d bridge sctp
$ sudo docker run --rm --name sctp-server --net=sctp sctp-test:latest ncat --sctp -lv 9999
$ sudo docker run --rm --name sctp-client --net=sctp sctp-test:latest ncat --sctp 172.18.0.2 9999
I finally found the cause of this problem.
The timeout occurred because the host and the container used the same SCTP port.
When I launched the container with a different host port, for example sudo docker run --rm --name sctp-server -p 19999:9999/sctp test-container:latest ncat --sctp -lv 9999, and ran ncat --sctp 127.0.0.1 19999 on the host machine, it worked fine.
I'm not certain, but I suspect iptables behavior is responsible.

Error with Docker daemon for docker installation on Fiware cloud

I am new to Fiware and Docker, so I need some help.
I am following the instructions at http://simple-docker-hosting-on-fiware-cloud.readthedocs.io/en/v1.0/manuals/install to create a docker-host machine on Fiware cloud, but when I run the following command:
docker-machine create -d openstack --openstack-flavor-id="2" --openstack-image-name="base_ubuntu_14.04" --openstack-net-name="node-int-net-01" --openstack-floatingip-pool="public-ext-net-01" --openstack-sec-groups="docker-sg" --openstack-ssh-user "ubuntu" docker-host
I receive the following error:
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded
I can see the docker-host instance on Fiware cloud, but when I run the following command:
eval "$(docker-machine env docker-host)"
the following error comes up:
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "147.27.60.136:2376": dial tcp 147.27.60.136:2376: connectex: No connection could be made because the target machine actively refused it.
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.
I also tried to regenerate the certificates:
docker-machine regenerate-certs docker-host
but I received the following error:
Error getting SSH command to check if the daemon is up: ssh command error:
command : sudo docker version
err : exit status 1
output : Client:
Version: 18.04.0-ce
API version: 1.37
Go version: go1.9.4
Git commit: 3d479c0
Built: Tue Apr 10 18:21:14 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Image with the result for the Command: docker-machine ls
What am I doing wrong?
I use Docker Community Edition on Windows 10.
The docker version is:
Client:
Version: 18.03.0-ce
API version: 1.37
Go version: go1.9.4
Git commit: 0520e24
Built: Wed Mar 21 23:06:28 2018
OS/Arch: windows/amd64
Experimental: false
Orchestrator: swarm
Server:
Version: 18.03.0-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.4
Git commit: 0520e24
Built: Wed Mar 21 23:14:32 2018
OS/Arch: linux/amd64
Experimental: false
First, make sure you've opened your docker port (tcp/2376) in your default security group.
I also suggest using base_ubuntu_16.04 instead of base_ubuntu_14.04.
Even so, it won't run properly at first, because there is a problem with the latest versions of docker and docker-machine. As a workaround, after running your docker-machine command, you can do this to fix the problem:
ssh docker-host 'sudo apt-get -y install linux-image-extra-$(uname -r) linux-image-extra-virtual ; sudo modprobe aufs ; sudo service docker start'
However, you might run into further problems due to the MTU configuration of your docker host. To solve them, you can lower the MTU with these commands:
docker-machine ssh docker-host "sudo sed -i 's/--label provider=openstack/--label provider=openstack\n--mtu=1400/g' /etc/default/docker"
docker-machine ssh docker-host "sudo service docker restart"
docker-machine ssh docker-host "sudo ip link set mtu 1400 dev docker0"

File copied by Docker seen as a directory

I'm trying to dockerize a Stardog 3.1.3 community edition server. The container fails to start because it sees a directory instead of a license file. For the record, I'm on Windows. These are the steps I'm following:
Create a data container
docker create -v /data/stardog:/data/stardog --name stardog_data busybox /bin/true
Copy the local license key to the data container (not done in the Dockerfile that is mentioned below as the license is environment specific)
docker cp .\stardog\stardog-license-key.bin stardog_data:/stardog-license-key.bin
Create an image based on the following Dockerfile
docker build -t me/stardog .
FROM java:openjdk-8-jdk
ENV STARDOG_VER stardog-3.1.3
ENV STARDOG_HOME /data/stardog
COPY ${LOCAL_PATH}/${STARDOG_VER}.zip /
RUN unzip ${STARDOG_VER}.zip
WORKDIR /${STARDOG_VER}
CMD rm $STARDOG_HOME/system.lock || true && bin/stardog-admin server start && (tail -f $STARDOG_HOME/stardog.log &) && while (pidof java > /dev/null); do sleep 1; done
Run a Stardog container
docker run -d -p 5820:5820 --volumes-from stardog_data --name stardog me/stardog
When I execute docker ps -a, I see that the container is stopped after a couple of seconds:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9559b22473e1 me/stardog "/bin/sh -c 'rm $STAR" 26 minutes ago Exited (1) 26 minutes ago stardog
2b929329e35e busybox "/bin/true" 32 minutes ago Created stardog_data
When I check the logs with docker logs stardog, I'm getting this:
com.clarkparsia.license.InvalidLicenseException: java.io.FileNotFoundException: /data/stardog/stardog-license-key.bin (Is a directory)
at com.clarkparsia.license.LicenseValidator.validate(LicenseValidator.java:157)
at com.complexible.stardog.StardogLicense.findLicense(StardogLicense.java:127)
at com.complexible.stardog.StardogLicense.<init>(StardogLicense.java:70)
at com.complexible.stardog.Stardog.<init>(Stardog.java:158)
at com.complexible.stardog.Stardog.initialize(Stardog.java:263)
at com.complexible.stardog.Stardog.initialize(Stardog.java:254)
at com.complexible.stardog.Stardog.buildServer(Stardog.java:247)
at com.complexible.stardog.cli.impl.ServerStart.call(ServerStart.java:144)
at com.complexible.stardog.cli.impl.ServerStart.call(ServerStart.java:47)
at com.complexible.stardog.cli.CLIBase.execute(CLIBase.java:54)
at com.complexible.stardog.cli.admin.CLI.main(CLI.java:194)
Caused by: java.io.FileNotFoundException: /data/stardog/stardog-license-key.bin (Is a directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at com.clarkparsia.license.LicenseValidator.validate(LicenseValidator.java:113)
... 10 more
Your Stardog license is invalid. Please contact support@clarkparsia.com for information on obtaining a new license.
It seems that the license file is considered to be a directory. What am I doing wrong?
I'm using the following Docker version:
Client:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 21:49:11 2016
OS/Arch: windows/amd64
Server:
Version: 1.10.3
API version: 1.22
Go version: go1.5.3
Git commit: 20f81dd
Built: Thu Mar 10 21:49:11 2016
OS/Arch: linux/amd64
This seems to be a Windows-related problem. I've tried the exact same steps on a native Ubuntu (14.04) machine, and there it works as expected. I've filed this as a bug, and hopefully it gets fixed soon.
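Whatever the underlying cause, it can help to confirm what actually sits at the license path before the server starts. The following is a minimal diagnostic sketch, not part of the original answer: it assumes a Python interpreter is available in the container (the image from the question may not ship one), and describe_path is a hypothetical helper name.

```python
import os


def describe_path(path: str) -> str:
    """Classify `path` as 'directory', 'file', 'other', or 'missing'."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    if os.path.lexists(path):
        # Exists but is neither a regular file nor a directory,
        # e.g. a broken symlink or a device node.
        return "other"
    return "missing"
```

For the setup in the question, describe_path("/data/stardog/stardog-license-key.bin") returning "directory" would reproduce what the exception reports, while "missing" would suggest the file was copied to a different location.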
