Setting multiple DOCKER_OPTS arguments - docker

If you want to pass options to the Docker Engine at startup on Ubuntu, you can edit the /etc/default/docker file.
Here I'm setting the storage driver to AUFS:
DOCKER_OPTS="--storage-driver=aufs"
However, if I pass more than one argument, Docker doesn't start. For example:
DOCKER_OPTS="--insecure-registry=0.0.0.0:5000 --storage-driver=aufs"
Now Docker fails to start:
# service docker stop && service docker start
docker start/running, process 31569
# service docker status
docker stop/waiting
From /var/log/syslog:
Mar 11 14:55:30 myhost kernel: [ 2788.030270] init: docker main process (31253) terminated with status 1
Mar 11 14:55:30 myhost kernel: [ 2788.030279] init: docker main process ended, respawning
Mar 11 14:55:30 myhost kernel: [ 2788.085931] init: docker main process (31287) terminated with status 1
Mar 11 14:55:30 myhost kernel: [ 2788.085940] init: docker respawning too fast, stopped
Each argument works on its own, but if passed together the Docker service refuses to start. I am using Docker version 1.10.3, build 20f81dd, on Ubuntu 14.04 (kernel 3.13.0-74-generic).
How can I pass more than one argument to DOCKER_OPTS?

The arguments must be separated by a comma, not whitespace.
This format works:
DOCKER_OPTS="--insecure-registry=0.0.0.0:5000,--storage-driver=aufs"

Related

Fedora 36 - Docker - Cannot connect to the Docker daemon [closed]

OS: Fedora 36
I noticed this when my Docker containers stopped working out of the blue; Fedora reported that docker-compose had stopped working. After system updates and a restart, I did the following:
sudo service docker start
This appeared to work, as I then ran sudo service docker status:
redirecting to /bin/systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor p>
Active: active (running) since Wed 2022-09-14 10:29:01 MDT; 1s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 2778 (dockerd)
Tasks: 22
Memory: 114.0M
CPU: 347ms
CGroup: /system.slice/docker.service
└─ 2778 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/con>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.385376990-06:00>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.439821904-06:00>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.696795461-06:00>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.839972916-06:00>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.895624616-06:00>
Sep 14 10:29:00 fedora dockerd[2778]: time="2022-09-14T10:29:00.994809032-06:00>
Sep 14 10:29:01 fedora dockerd[2778]: time="2022-09-14T10:29:01.017873180-06:00>
Sep 14 10:29:01 fedora dockerd[2778]: time="2022-09-14T10:29:01.018007624-06:00>
Sep 14 10:29:01 fedora systemd[1]: Started docker.service - Docker Application >
Sep 14 10:29:01 fedora dockerd[2778]: time="2022-09-14T10:29:01.035944310-06:00>
So I can see it is up and running. I ran this again five minutes later; same result.
Next I ran docker ps -a and got:
Cannot connect to the Docker daemon at
unix:///home/XXXXX/.docker/desktop/docker.sock. Is the docker daemon
running?
Which is odd, so next I checked who owns the docker.sock:
sudo ls -la /var/run/docker.sock
srw-rw---- 1 root docker 0 Sep 14 10:29 /var/run/docker.sock
For some reason it's owned by root, so I decided to change it to my user:
sudo chown XXXXX:docker /var/run/docker.sock
Now it shows my user, XXXXX:docker (user name blanked out):
srw-rw---- 1 XXXXX docker 0 Sep 14 10:29 /var/run/docker.sock
Now I stop and start the service again, as above. As before, it shows as running after sudo service docker status.
Now if I try and do docker ps -a I still get:
Cannot connect to the Docker daemon at
unix:///home/XXXXX/.docker/desktop/docker.sock. Is the docker
daemon running?
I have googled and I have searched, but I am so confused: Docker is running, yet apparently it's not running?
How do I fix this?
The only thing I can think of is blowing away docker completely and re-installing, but that seems drastic.
Everywhere I look, the advice is the same:
Make sure it's running - check
Change owner/group of the sock file - done
Restart Docker - done
Check status - done
Another thing that I stumbled upon was:
sudo dockerd
This gave me a bunch of output, but at the end it said:
failed to start daemon: error while opening volume store metadata
database: timeout
I'm also running Fedora 36.
Docker is looking in your home directory for docker.sock
/home/XXXXX/.docker/desktop/docker.sock
This is most likely because you had, or still have, Docker Desktop installed.
If you remove Docker Desktop, there is a possibility that the .docker folder is still present in your home directory. When reinstalling Docker Engine instead of Docker Desktop, the client would still try to connect to the docker.sock in the above directory.
I had the same problem and solved it by doing the following:
Uninstall Docker Desktop
Uninstall Docker Engine as per the installation guide
Delete the .docker folder: rm -R ~/.docker
Reinstall Docker Engine as per the installation guide
Add your user to the docker group: sudo usermod -aG docker $USER
Do one of the following to tell the terminal that the group has been updated:
run newgrp docker
OR log out and log back in on your machine
OR restart your machine
Run docker run hello-world to confirm it's working now
The above is what fixed the issue on my machine, though there is always the possibility that our setups aren't exactly the same.
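One more thing worth checking before reinstalling anything: the unix:///home/XXXXX/.docker/desktop/docker.sock path in the error message usually means the Docker CLI is still pointed at a Docker Desktop context (or a leftover DOCKER_HOST variable). You can inspect and reset that without touching the installation:
docker context ls            # shows the active context and the endpoint it points to
docker context use default   # switch back to the system daemon at /var/run/docker.sock
echo $DOCKER_HOST            # if set, it overrides the context; unset it if it names the desktop socket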

Why does Loki's Docker Driver Client stop logging after some time?

I want to send logs of my Docker containers to Grafana Loki. Therefore, I installed Loki's Docker Driver Client and started my containers with it. First I can see logs, but after some time I see no more logs.
Installation
I installed Loki's Docker Driver Client as a Docker plugin on my Docker Engine (version 20.10.2):
$ docker plugin install grafana/loki-docker-driver:master-54d1d3b --alias loki --grant-all-permissions
I didn't use the tag latest, because of the bug "Unable to connect to logging plugin in Swarm".
Configuration
I started my Docker containers with Loki's Docker Driver Client as log driver:
$ docker container run \
    --log-driver=loki \
    --log-opt loki-url="$LOKI_URL" \
    --log-opt loki-retries=5 \
    --log-opt loki-batch-size=400 \
    --log-opt max-size="10m" \
    --log-opt max-file=5 \
    --detach \
    --name $CONTAINER_NAME \
    --restart unless-stopped \
    $IMAGE:$TAG
I also added the json-file driver's max-size and max-file options to limit disk space; see Configuring the Docker Driver.
Problem
First I could see logs in Grafana and on the command line with docker container logs, but after some time no more logs were shown. When I tried to look at the logs on the Docker host, I saw an error:
$ docker container logs 75d4b13eb3e8
error from daemon in stream: Error grabbing logs: error getting log reader: LogDriver.ReadLogs: logger does not exist for 75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
Research
I looked into the directories of the containers (see Where is a log file with logs from a container?), but I couldn't see any log files:
$ sudo ls /var/lib/docker/containers/75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
checkpoints config.v2.json hostconfig.json hostname hosts mounts resolv.conf resolv.conf.hash
I also checked the log path (see Get an instance’s log path), but it was empty:
$ docker inspect --format='{{.LogPath}}' 75d4b13eb3e8
I found the container's logs in the plugin's directory (see Loki log driver not storing logs as files on disk, even with keep-file: true), but the log files no longer change:
$ sudo ls -la /var/lib/docker/plugins/eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288/rootfs/var/log/docker/75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
total 912
drwxr-xr-x 2 root root 4096 Jan 22 12:59 .
drwxr-xr-x 17 root root 4096 Jan 22 15:46 ..
-rw-r----- 1 root root 923177 Jan 22 13:34 json.log
I looked into the Docker daemon's logs (see Read the logs) and found errors and a warning (logged at the same time that logging stopped):
$ sudo journalctl -u docker.service | grep eac33cc9913c
[...]
[...]level=error msg="panic: send on closed channel" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="goroutine 153 [running]:" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="main.(*loki).Log(0xc0000c5e00, 0xc0001d81c0, 0xc0000c5e80, 0x0)" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/loki.go:69 +0x2fb" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="main.consumeLog(0xc0002c0480)" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/driver.go:165 +0x4c2" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="created by main.(*driver).StartLogging" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/driver.go:116 +0xa75" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=warning msg="Unable to connect to plugin: /run/docker/plugins/eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288/loki.sock/LogDriver.StopLogging: Post http://%2Frun%2Fdocker%2Fplugins%2Feac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288%2Floki.sock/LogDriver.StopLogging: EOF, retrying in 1s"
[...]
What did I do wrong?
I was experiencing the same issue.
My only differences in configuration are that I'm trialing the latest Enterprise Edition (19.03), since it brings dual logging capability (although this is also supported in the latest CE versions), and that I'm using the latest Loki Docker driver client, now that the GitHub issue mentioned previously has been resolved.
I ended up setting the log options no-file and keep-file in docker-compose.yml:
logging:
  driver: "loki"
  options:
    loki-url: "http://${LOKI_URL}:3100/loki/api/v1/push"
    loki-batch-size: "400"
    no-file: "false"
    keep-file: "true"
    max-size: "5m"
    max-file: "3"
Since making this change I am receiving logs in Loki and can still use docker container logs and docker service logs on my Docker hosts.
no-file: "false" tells the driver to continue creating logs on disk and keep-file: "true" tells the driver to keep json logs if the container is stopped (by default files are removed).
Note: originally I added these settings to /etc/docker/daemon.json on the host, but I would still see the "error getting log reader" issue; I had to switch to specifying the log driver per container/swarm service.
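For illustration, per-service placement in a compose file looks like this (a sketch; the service name and image are placeholders):
services:
  myapp:
    image: myorg/myapp:latest   # placeholder image
    logging:
      driver: "loki"
      options:
        loki-url: "http://${LOKI_URL}:3100/loki/api/v1/push"
        keep-file: "true"
        no-file: "false"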
Regarding this issue:
First I could see logs in Grafana and in command line with docker container logs, but after some time no more logs were shown.
In Grafana, please select Query type: Range rather than Instant, and you will see all the logs for the selected period of time, if they exist in Loki.
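For example, with Query type set to Range, a label query over the selected window might look like this (a sketch; container_name is one of the labels the Loki Docker driver attaches by default, but your labels may differ):
{container_name="mycontainer"}              # all logs for one container over the range
{container_name="mycontainer"} |= "error"   # only lines containing "error"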

When using Mesos, Marathon, and Zookeeper, my mesos-slave doesn't start when I specify the "containerizers" file with "docker,mesos"?

I have 3 CentOS VMs; I have installed Zookeeper, Marathon, and Mesos on the master node, while only putting Mesos on the other 2 VMs. The master node has no mesos-slave running on it. I am trying to run Docker containers, so I specified "docker,mesos" in the containerizers file. One of the mesos-agents starts fine with this configuration and I have been able to deploy a container to that slave. However, the second mesos-agent simply fails to start with this configuration (it works if I take out the containerizers file, but then it doesn't run containers). Here are some of the logs and information that came up:
Here are some "messages" in the log directory:
Apr 26 16:09:12 centos-minion-3 systemd: Started Mesos Slave.
Apr 26 16:09:12 centos-minion-3 systemd: Starting Mesos Slave...
WARNING: Logging before InitGoogleLogging() is written to STDERR
[main.cpp:243] Build: 2017-04-12 16:39:09 by centos
[main.cpp:244] Version: 1.2.0
[main.cpp:247] Git tag: 1.2.0
[main.cpp:251] Git SHA: de306b5786de3c221bae1457c6f2ccaeb38eef9f
[logging.cpp:194] INFO level logging started!
[systemd.cpp:238] systemd version `219` detected
[main.cpp:342] Inializing systemd state
[systemd.cpp:326] Started systemd slice `mesos_executors.slice`
[containerizer.cpp:220] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
[linux_launcher.cpp:150] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
[provisioner.cpp:249] Using default backend 'copy'
[slave.cpp:211] Mesos agent started on (1)#172.22.150.87:5051
[slave.cpp:212] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="linux" --launcher_dir="/usr/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" --runtime_dir="/var/run/mesos" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/var/lib/mesos"
[slave.cpp:541] Agent resources: cpus(*):1; mem(*):919; disk(*):2043; ports(*):[31000-32000]
[slave.cpp:549] Agent attributes: [ ]
[slave.cpp:554] Agent hostname: node3
[status_update_manager.cpp:177] Pausing sending status updates
[state.cpp:62] Recovering state from '/var/lib/mesos/meta'
[state.cpp:706] No committed checkpointed resources found at '/var/lib/mesos/meta/resources/resources.info'
[status_update_manager.cpp:203] Recovering status update manager
[docker.cpp:868] Recovering Docker containers
[containerizer.cpp:599] Recovering containerizer
[provisioner.cpp:410] Provisioner recovery complete
[group.cpp:340] Group process (zookeeper-group(1)#172.22.150.87:5051) connected to ZooKeeper
[group.cpp:830] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
[group.cpp:418] Trying to create path '/mesos' in ZooKeeper
[detector.cpp:152] Detected a new leader: (id='15')
[group.cpp:699] Trying to get '/mesos/json.info_0000000015' in ZooKeeper
[zookeeper.cpp:259] A new leading master (UPID=master#172.22.150.88:5050) is detected
Failed to perform recovery: Collect failed: Failed to run 'docker -H unix:///var/run/docker.sock ps -a': exited with status 1; stderr='Cannot connect to the Docker daemon. Is the docker daemon running on this host?'
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/meta/slaves/latest
This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.
Apr 26 16:09:13 centos-minion-3 systemd: mesos-slave.service: main process exited, code=exited, status=1/FAILURE
Apr 26 16:09:13 centos-minion-3 systemd: Unit mesos-slave.service entered failed state.
Apr 26 16:09:13 centos-minion-3 systemd: mesos-slave.service failed.
Logs from docker:
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: inactive (dead) since Tue 2017-04-25 18:00:03 CDT; 24h ago
     Docs: docs.docker.com
 Main PID: 872 (code=exited, status=0/SUCCESS)
Apr 26 18:25:25 centos-minion-3 systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 26 18:25:25 centos-minion-3 systemd[1]: Job docker.service/start failed with result 'dependency'
Logs from flannel:
[flanneld-start: network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
You have the answer in your logs:
Failed to perform recovery: Collect failed:
Failed to run 'docker -H unix:///var/run/docker.sock ps -a': exited with status 1;
stderr='Cannot connect to the Docker daemon. Is the docker daemon running on this host?'
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/meta/slaves/latest
This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.
Mesos keeps its state/metadata on local disk. When it's restarted, it tries to load this state. If the configuration changed and is not compatible with the previous state, the agent won't start.
Just bring Docker back to life by fixing the problems with flannel and etcd, and everything will be fine.
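A sketch of the order to check and repair things in, assuming the systemd unit names commonly used on CentOS 7 (etcd, flanneld, docker; adjust to your install):
systemctl status etcd flanneld docker         # work bottom-up: etcd, then flannel, then docker
etcdctl cluster-health                        # etcd v2 CLI; flannel reads its network config from etcd
sudo systemctl restart flanneld docker        # once etcd is healthy again
sudo rm -f /var/lib/mesos/meta/slaves/latest  # the remedy from the agent log: drop stale recovery state
sudo systemctl restart mesos-slave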
Add the following flag when starting the agent:
--reconfiguration_policy=additive
More details here: http://mesos.apache.org/documentation/latest/agent-recovery/
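If the agent was installed from the Mesosphere packages, flags can usually also be set by dropping a file named after the flag under /etc/mesos-slave (an assumption about your packaging; passing the flag on the command line works the same way):
echo additive | sudo tee /etc/mesos-slave/reconfiguration_policy
sudo systemctl restart mesos-slave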

Docker fails to start due to "volume store metadata database: timeout"

I have followed the installation instructions for Docker CE on CentOS. Initially this worked. At some point the system was restarted, and now starting Docker fails. I'd appreciate expert eyes on this matter...
systemctl start docker produces:
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
systemctl status docker.service produces:
Apr 21 11:25:23 sec-services-build-1 systemd[1]: Starting Docker Application Container Engine...
Apr 21 11:25:23 sec-services-build-1 dockerd[9693]: time="2017-04-21T11:25:23.370390797+03:00" level=info msg="libcontainerd: previous instance of containerd still alive (8908)"
Apr 21 11:25:23 sec-services-build-1 dockerd[9693]: time="2017-04-21T11:25:23.382492171+03:00" level=warning msg="overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior. Reformat the filesystem with ftype=1 to enable d_type support. Running without d_type support will no longer be supported in Docker 17.12."
Apr 21 11:25:23 sec-services-build-1 dockerd[9693]: time="2017-04-21T11:25:23.382547668+03:00" level=info msg="[graphdriver] using prior storage driver: overlay"
Apr 21 11:25:24 sec-services-build-1 dockerd[9693]: Error starting daemon: error while opening volume store metadata database: timeout
Apr 21 11:25:24 sec-services-build-1 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Apr 21 11:25:24 sec-services-build-1 systemd[1]: Failed to start Docker Application Container Engine.
Apr 21 11:25:24 sec-services-build-1 systemd[1]: Unit docker.service entered failed state.
Apr 21 11:25:24 sec-services-build-1 systemd[1]: docker.service failed.
From here: https://github.com/moby/moby/issues/22507
I ran:
ps axf | grep docker | grep -v grep | awk '{print "kill -9 " $1}' | sudo sh
I was then able to restart docker using:
sudo systemctl start docker
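A slightly more targeted variant of that pipeline (an alternative sketch; -f matches against the full command line, which also catches the containerd process that the "previous instance of containerd still alive" log line refers to):
sudo pkill -9 -f dockerd      # kill the stuck daemon
sudo pkill -9 -f containerd   # and the leftover containerd instance
sudo systemctl start docker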
Step 1: systemctl status docker — if Docker is running, stop it.
Step 2: systemctl stop docker
Step 3: dockerd — running the daemon in the foreground prints its startup errors straight to the terminal.
I got this message after copying volumes from a production machine: I had overwritten metadata.db inside /var/lib/docker/volumes, and then Docker crashed. The fix is simple (note that docker system prune --volumes removes all unused volumes):
docker system prune --volumes -f && rm /var/lib/docker/volumes/metadata.db && docker-compose up -d
I encountered the same error.
❶ I tried
sudo kill -9 1452
multiple times, but it didn't work. There was still a defunct dockerd process:
1452 ? Zsl 127:42 [dockerd] <defunct>
❷ I tried what @Artur Mustafin suggested:
sudo mv /var/lib/docker/volumes/metadata.db /var/lib/docker/volumes/metadata.db.bk
It worked.
So I tried all of these and nothing worked. What did work was removing all the containers from /var/lib/docker/containers. Then I killed all Docker processes (ps -ef | grep docker) and restarted Docker and the Docker socket. When Docker became active, I added the containers back one at a time; one container was what had caused the issues.

Docker daemon does not start

I'm trying to start the Docker daemon:
sudo systemctl start docker
But nothing happens; the cursor just blinks and the command never returns.
Yesterday it was working properly :(
sudo journalctl -fu docker
ago 18 16:05:24 host docker[1602]: time="2016-08-18T16:05:24.467635627-05:00" level=info msg="New containerd process, pid: 1609\n"
ago 18 16:05:24 host docker[1602]: time="2016-08-18T16:05:24.482107319-05:00" level=fatal msg="bad listen address format /var/run/docker/libcontainerd/docker-containerd.sock, expected proto://address"
ago 18 16:05:30 host docker[1602]: time="2016-08-18T16:05:30.470570243-05:00" level=info msg="New containerd process, pid: 1620\n"
ago 18 16:05:30 host docker[1602]: time="2016-08-18T16:05:30.491495106-05:00" level=fatal msg="bad listen address format /var/run/docker/libcontainerd/docker-containerd.sock, expected proto://address"
ago 18 16:08:06 host systemd[1]: Stopped Docker Application Container Engine.
-- Reboot --
ago 18 16:16:52 host systemd[1]: Starting Docker Application Container Engine...
ago 18 16:16:54 host docker[2294]: time="2016-08-18T16:16:54.360878396-05:00" level=info msg="New containerd process, pid: 2327\n"
ago 18 16:16:54 host docker[2294]: time="2016-08-18T16:16:54.686503187-05:00" level=fatal msg="bad listen address format /var/run/docker/libcontainerd/docker-containerd.sock, expected proto://address"
ago 18 16:17:00 host docker[2294]: time="2016-08-18T16:17:00.664023288-05:00" level=info msg="New containerd process, pid: 2368\n"
ago 18 16:17:00 host docker[2294]: time="2016-08-18T16:17:00.67708602-05:00" level=fatal msg="bad listen address format /var/run/docker/libcontainerd/docker-containerd.sock, expected proto://address"
One interesting thing with systemd is that if it thinks that a daemon is running, then the start command does nothing.
I have had to do the following to make sure I cleanly restart certain daemons:
sudo systemctl stop service-name
# wait a little if the service is slow to stop like the Cassandra database
sudo systemctl start service-name
That has worked for me with various services.
One way to know whether the service is considered running is to check its status like so:
systemctl status service-name
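A few commands that make systemd's view of the service visible (standard systemd tooling):
systemctl is-active docker     # prints active, inactive, or failed
sudo systemctl restart docker  # stop + start in one step once the state is known
journalctl -u docker -n 50     # the daemon's most recent log lines if it still won't start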
