Docker service doesn't auto start after moving the docker image data directory to external drive location - docker

Following this page, I have moved the Docker data directory and created a symbolic link to it. It works, but every time I reboot my computer, the Docker service no longer starts automatically. How can I solve this problem?
journalctl -u docker.service returns:
Jun 30 10:29:55 ubuntu systemd[1]: Starting Docker Application Container Engine...
Jun 30 10:29:55 ubuntu dockerd[2358]: time="2022-06-30T10:29:55.426467188+10:00" level=info msg="S>
Jun 30 10:29:55 ubuntu dockerd[2358]: mkdir /var/lib/docker: file exists
Jun 30 10:29:55 ubuntu systemd[1]: docker.service: Main process exited, code=exited, status=1/FAIL>
Jun 30 10:29:55 ubuntu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 10:29:55 ubuntu systemd[1]: Failed to start Docker Application Container Engine.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Jun 30 10:29:57 ubuntu systemd[1]: Stopped Docker Application Container Engine.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Start request repeated too quickly.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 10:29:57 ubuntu systemd[1]: Failed to start Docker Application Container Engine.
Before I moved the data directory, /var/lib/docker was a regular directory used by Docker; now it is a symbolic link that points to the external directory where the Docker image data is stored. Why is there a mkdir command?
If I run dockerd, it returns:
INFO[2022-06-30T20:53:05.143671302+10:00] Starting up
dockerd needs to be started with root privileges. To run dockerd in rootless mode as an unprivileged user, see https://docs.docker.com/go/rootless/
If I run sudo service docker start, Docker starts without error. But I don't want to run this every day. Docker used to start automatically. Any ideas?

I was able to reproduce the error message with the same configuration:
systemd[1]: Starting Docker Application Container Engine...
dockerd[47623]: time="2022-06-30T16:36:20.047741616Z" level=in..
dockerd[47623]: mkdir /data/docker: file exists
systemd[1]: docker.service: Main process exited, code=exited, ..
The reason was that my external drive wasn't mounted yet, so the symlink was dangling when dockerd tried to create its data directory.
Adding systemd mount/automount units resolves the issue. Alternatively, you can add your external drive to /etc/fstab (add the nofail option to avoid the 90-second boot wait when the drive is not attached).
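One way to make the ordering explicit is a drop-in that tells systemd not to start docker.service until the data path is reachable. The sketch below is only an illustration: write_mount_dropin and the file name wait-for-data-mount.conf are hypothetical, /data/docker is the example path from this answer, and it assumes the drive already has an fstab entry or mount unit for systemd to depend on.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that delays docker.service until the
# filesystem holding the Docker data directory is mounted.
# write_mount_dropin is a hypothetical helper, not part of Docker or systemd.
write_mount_dropin() {
    data_dir="$1"                                         # e.g. /data/docker
    dropin_dir="${2:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # RequiresMountsFor= adds Requires= and After= dependencies on the
    # mount units needed to reach the given path.
    printf '[Unit]\nRequiresMountsFor=%s\n' "$data_dir" \
        > "$dropin_dir/wait-for-data-mount.conf"
}

# Usage (as root), followed by a reload so systemd picks up the drop-in:
#   write_mount_dropin /data/docker && systemctl daemon-reload
```

With a symlinked /var/lib/docker, pass the real target directory on the external drive rather than the symlink.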
Also from Docker doc:
You can configure the Docker daemon to use a different directory, using the data-root configuration option.
So editing your /etc/docker/daemon.json with:
{
"data-root": "/data/docker"
}
is probably better than using symlinks.
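After restarting the daemon, it may be worth confirming that the new data-root is actually in use. A minimal sketch, assuming the daemon is running and the docker CLI can reach it; verify_data_root is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: compare the data directory the daemon reports with the one
# configured in /etc/docker/daemon.json.
verify_data_root() {
    expected="$1"
    # DockerRootDir is the field of `docker info` holding the data root.
    actual=$(docker info --format '{{.DockerRootDir}}') || return 2
    if [ "$actual" = "$expected" ]; then
        echo "ok: data root is $actual"
    else
        echo "mismatch: expected $expected, daemon reports $actual" >&2
        return 1
    fi
}

# Usage:
#   verify_data_root /data/docker
```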

Related

Detect if docker should run or start a container on startup - using systemd but it keeps propagating errors - how to stop?

I'm trying to use a startup script on a Google Compute Engine instance to either:
If the docker container called rstudio is present but in stopped state, run docker start rstudio
If the docker container is not present, run docker run --name=rstudio rocker/rstudio
From this SO I thought this could be achieved via docker top rstudio || docker run --name=rstudio rocker/rstudio but it seems to always error at the docker top rstudio part. In that case, I have tried piping docker top rstudio &>/dev/null but no effect.
I have a cloud-config that runs when the instance boots up.
My problem is that the script to run or start the container keeps registering as an error, and doesn't go on to the logic of pulling the image. I have tried putting it in a separate bash script and directly via ExecStart, also putting "-" in front of the ExecStart command (which is supposed to ignore errors?), but this also seems to have no effect. This is where I have ended up:
#cloud-config
users:
- name: gcer
  uid: 2000
write_files:
- path: /home/gcer/docker-rstudio.sh
  permissions: 0755
  owner: root
  content: |
    #!/bin/bash
    echo "Docker RStudio launch script"
    if ! docker top rstudio &>/dev/null
    then
      echo "Pulling new rstudio"
      docker run -p 80:8787 \
      -e ROOT=TRUE \
      -e USER=%s -e PASSWORD=%s \
      -v /home/gcer:/home/rstudio \
      --name=rstudio \
      %s
    else
      echo "Starting existing rstudio"
      docker start rstudio
    fi
- path: /etc/systemd/system/rstudio.service
  permissions: 0644
  owner: root
  content: |
    [Unit]
    Description=RStudio Server
    Requires=docker.service
    After=docker.service
    [Service]
    Restart=always
    Environment="HOME=/home/gcer"
    ExecStartPre=/usr/share/google/dockercfg_update.sh
    ExecStart=-/home/gcer/docker-rstudio.sh
    ExecStop=/usr/bin/docker stop rstudio
runcmd:
- systemctl daemon-reload
- systemctl start rstudio.service
Whatever I try, I end up with this error log when I run sudo journalctl -u rstudio.service
Feb 14 23:26:09 test-9 systemd[1]: Started RStudio Server.
Feb 14 23:26:09 test-9 docker[770]: Error response from daemon: No such container: rstudio
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Control process exited, code=exited status=1
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Service hold-off time over, scheduling restart.
Feb 14 23:26:09 test-9 systemd[1]: Stopped RStudio Server.
Feb 14 23:26:09 test-9 systemd[1]: Starting RStudio Server...
...
Feb 14 23:26:09 test-9 systemd[1]: Started RStudio Server.
Feb 14 23:26:09 test-9 docker[809]: Error response from daemon: No such container: rstudio
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Control process exited, code=exited status=1
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Service hold-off time over, scheduling restart.
Feb 14 23:26:10 test-9 systemd[1]: Stopped RStudio Server.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Start request repeated too quickly.
Feb 14 23:26:10 test-9 systemd[1]: Failed to start RStudio Server.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Can anyone help me get this working?
I would delete the container when you stop it. Then your startup script reduces to making extra sure the container is deleted, and unconditionally docker running it anew.
This would make the entire contents of the script be:
#!/bin/sh
# Remove any previous container; errors are ignored if it doesn't exist.
docker stop rstudio
docker rm rstudio
# Always start a fresh container.
docker run -p 80:8787 \
--name=rstudio \
... \
rocker/rstudio
Without the set -e option, the script still goes on to the docker run command even if the earlier commands fail (because the container doesn't exist). This avoids having to test whether a container is there and always leaves you in a consistent state.
Similarly, to clean up a little better, I'd change the systemd unit file to delete the container after it stops
ExecStop=/usr/bin/docker stop rstudio
ExecStopPost=/usr/bin/docker rm rstudio
(Your setup has three possible states: the container is running; the container exists but is stopped; and the container doesn't exist. My setup removes the "exists but is stopped" state, which doesn't have a whole lot of value, especially since you use a docker run -v option to store data outside of container space.)
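If you would rather keep the start-or-run behaviour from the question instead of always recreating the container, the existence check can be sketched with docker container inspect, which exits non-zero when the container does not exist. start_or_run is a hypothetical helper, and using docker start -a to keep the process in the foreground (as docker run does without -d) is an assumption about what the systemd unit needs.

```shell
#!/bin/sh
# Sketch: start an existing container, or create it if it is missing.
# Usage: start_or_run NAME [docker run options...] IMAGE
start_or_run() {
    name="$1"
    shift
    if docker container inspect "$name" >/dev/null 2>&1; then
        # Container exists (running or stopped): attach to it.
        docker start -a "$name"
    else
        # No such container: create and run it with the remaining arguments.
        docker run --name "$name" "$@"
    fi
}

# Example, with the image name from the question:
#   start_or_run rstudio -p 80:8787 rocker/rstudio
```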

docker.service failed. See 'journalctl -xe' for details

docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2018-05-17 15:47:26 CEST; 17h ago
Docs: https://docs.docker.com
Main PID: 11843 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/docker.service
May 18 08:48:38 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 08:49:09 temp systemd[1]: Stopped Docker Application Container Engine.
May 18 08:49:09 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 08:49:09 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 08:49:15 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 08:49:15 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 09:00:03 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 09:00:03 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
May 18 09:03:51 temp systemd[1]: Dependency failed for Docker Application Container Engine.
May 18 09:03:51 temp systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
I tried uninstalling and reinstalling Docker, but it raises the same error ("Is the docker daemon running?"). Can someone help me here?
There is a service that docker requires that is not running, thus, systemd won't launch docker.
Try running journalctl -f (without -u) to see the logs of all units, then start docker and read the log carefully; you will probably see some other units trying to start and failing.
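To narrow this down without reading the whole journal, you could list the units that docker.service requires and print the ones that are not active. This is only a sketch: list_inactive_requirements is a hypothetical helper, and it looks at Requires= alone, so a failure further down the dependency chain would need the same check repeated on the unit it reports.

```shell
#!/bin/sh
# Sketch: print every unit required by the given unit that is not active.
list_inactive_requirements() {
    unit="$1"
    # `systemctl show -p Requires --value` prints the required units
    # as a space-separated list.
    for dep in $(systemctl show -p Requires --value "$unit"); do
        # is-active prints the state and exits non-zero unless it is "active".
        state=$(systemctl is-active "$dep" || true)
        [ "$state" = "active" ] || echo "$dep: $state"
    done
}

# Usage:
#   list_inactive_requirements docker.service
```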
You can find the reason why docker isn't starting by running
/usr/bin/dockerd -H unix://
In my case it was a fresh install of Centos7 with Docker 18.09
ERRO[2018-11-14T22:14:55.441548150+02:00] 'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded. storage-driver=overlay2
ERRO[2018-11-14T22:14:55.444930007+02:00] AUFS was not found in /proc/filesystems storage-driver=aufs
ERRO[2018-11-14T22:14:55.447984399+02:00] 'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded. storage-driver=overlay
To fix that, I had to upgrade to a newer kernel and remove the current Docker storage (note that this deletes all local images, containers, and volumes):
rm -rf /var/lib/docker
Then docker started working
I have this problem on my machine and haven't managed to solve it.
But if you are in a hurry, you can run
/usr/bin/dockerd -H unix:///var/run/docker.sock
All the usual commands will then work (docker system, docker etc.).

docker service does not start after creating daemon.json

Following error message appears when doing the steps below
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2017-08-30 09:21:52 CEST; 13s ago
Docs: https://docs.docker.com
Process: 11581 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Main PID: 11581 (code=exited, status=1/FAILURE)
CPU: 28ms
Aug 30 09:21:52 debian systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 30 09:21:52 debian systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Aug 30 09:21:52 debian systemd[1]: Stopped Docker Application Container Engine.
Aug 30 09:21:52 debian systemd[1]: docker.service: Start request repeated too quickly.
Aug 30 09:21:52 debian systemd[1]: Failed to start Docker Application Container Engine.
Aug 30 09:21:52 debian systemd[1]: docker.service: Unit entered failed state.
Aug 30 09:21:52 debian systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 30 09:22:00 debian systemd[1]: docker.service: Start request repeated too quickly.
Aug 30 09:22:00 debian systemd[1]: Failed to start Docker Application Container Engine.
Aug 30 09:22:00 debian systemd[1]: docker.service: Failed with result 'exit-code'.
I created a fresh Ubuntu 64bit VM on VirtualBox.
Then I used the install script to install docker: https://get.docker.com/
After the installation succeeded, I tried to configure the daemon to listen on 10.0.2.15:2375 so that I can forward it to my host OS
I ran nano /etc/docker/daemon.json to create the file
I pasted following example into it
{
"debug": true,
"tls": false,
"tlscert": "/var/docker/server.pem",
"tlskey": "/var/docker/serverkey.pem",
"hosts": ["tcp://10.0.2.15:2375"]
}
then I ran service docker restart
running service docker status shows me the message above
Check the Docker version on your machine with
docker --version
I was facing the same issue, and it was solved by upgrading Docker to the latest available version.
The documentation on Docker's official website doesn't mention anything about this.
Once you have upgraded Docker, restart it with
systemctl restart docker
The error will be gone, and the new settings will take effect.
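Another possible cause is visible in the status output in the question: the unit starts dockerd with -H fd://. Docker's documentation notes that a hosts entry in daemon.json conflicts with a -H flag on the command line and prevents the daemon from starting. If that is what is happening here (an assumption, since the daemon's own error line is not shown in the question), a drop-in that clears the flag is one workaround. write_hosts_dropin is a hypothetical helper and override.conf an arbitrary file name.

```shell
#!/bin/sh
# Sketch: write a drop-in that starts dockerd without -H, so that the
# "hosts" key in /etc/docker/daemon.json is the only listener setting.
write_hosts_dropin() {
    dropin_dir="${1:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # The empty ExecStart= resets the command inherited from the packaged
    # unit; the second ExecStart= replaces it.
    printf '[Service]\nExecStart=\nExecStart=/usr/bin/dockerd\n' \
        > "$dropin_dir/override.conf"
}

# Usage (as root):
#   write_hosts_dropin && systemctl daemon-reload && systemctl restart docker
```

Because the unit's fd:// socket is no longer passed in, you may also want a unix socket entry in the hosts list so that the local docker CLI keeps working.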

docker not responding when using different data directory

I want to change the image directory in docker. I tried the initial two methods mentioned here. Both methods work and change the directory for docker images. But the problem is that the images stop responding. I can run the hello world example but if I run the ubuntu container or the whalesay container, docker stops responding and I can't run it again.
docker run -it ubuntu bash
docker run docker/whalesay cowsay boo
On using the above commands, the images get downloaded and nothing happens. Then I enter the command again to run and the system stops responding. I used Ctrl + C to terminate it but after that I can not open any other terminal screen. Also, the system doesn't power off; it gets stuck at a black screen. On force restarting the system docker starts failing to run giving the following log:
docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2017-04-14 20:12:14 EDT; 10min ago
Docs: https://docs.docker.com
Process: 1160 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Main PID: 1160 (code=exited, status=1/FAILURE)
Apr 14 20:12:14 abmittal-linux systemd[1]: Starting Docker Application Container Engine...
Apr 14 20:12:14 abmittal-linux dockerd[1160]: unable to configure the Docker daemon with file /etc/docker/daemon.json: EOF
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Apr 14 20:12:14 abmittal-linux systemd[1]: Failed to start Docker Application Container Engine.
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Unit entered failed state.
Apr 14 20:12:14 abmittal-linux systemd[1]: docker.service: Failed with result 'exit-code'
Removing and reinstalling docker also doesn't work if the directory is same as before (even if the directory has been deleted and then made again). I have to change the directory in the configuration to get it to run again but again it stops responding.
The following is my daemon.json file:
{
"graph":"/mnt/other/docker_images"
}
EDIT: I think I may have found the error. The partition /mnt/other is using NTFS file system (and is on a different disk). Can someone please confirm if this might be the source of the error?
This is a known bug: Link
I tried creating a custom directory on an ext4 partition and it worked.
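Before pointing Docker at a new directory, you can check which filesystem it lives on. The sketch below assumes a stat that supports -f -c %T (as GNU coreutils does); fs_type_of and check_docker_fs are hypothetical helpers, and treating fuseblk as suspicious is an assumption based on how NTFS mounted through ntfs-3g is usually reported.

```shell
#!/bin/sh
# Sketch: report the filesystem type behind a directory and warn when it
# looks like NTFS.
fs_type_of() {
    # -f queries the filesystem instead of the file; %T prints its type name.
    stat -f -c %T "$1"
}

check_docker_fs() {
    fstype=$(fs_type_of "$1") || return 2
    case "$fstype" in
        ntfs*|fuseblk)
            echo "warning: $1 is on $fstype, which may not work as a Docker data directory" >&2
            return 1 ;;
        *)
            echo "$1 is on $fstype" ;;
    esac
}

# Usage, with the path from the question:
#   check_docker_fs /mnt/other/docker_images
```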

CoreOS Unit Failed on Launched

I tried launching a service using chat.service unit file on a CoreOS and it failed:
// chat.service
[Unit]
Description=ChatApp
[Service]
ExecStartPre=-/usr/bin/docker kill simplechat1
ExecStartPre=-/usr/bin/docker rm simplechat1
ExecStartPre=-/usr/bin/docker pull jochasinga/socketio-chat
ExecStart=/usr/bin/docker run -p 3000:3000 --name simplechat1 jochasinga/socketio-chat
fleetctl list-units shows:
UNIT MACHINE ACTIVE SUB
chat.service cfe13a03.../<virtual-ip> failed failed
However, if I changed the chat.service to just:
// chat.service
[Service]
ExecStart=/usr/bin/docker run -p 3000:3000 <mydockerhubuser>/socketio-chat
It ran just fine. fleetctl list-units shows:
UNIT MACHINE ACTIVE SUB
chat.service 8df7b42d.../<virtual-ip> active running
EDIT
Using journalctl -u chat.service I got:
Jun 02 00:02:47 core-01 systemd[1]: Started chat.service.
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Main process exited, code=exited, status=125/n/a
Jun 02 00:02:47 core-01 docker[8924]: docker: Error response from daemon: failed to create endpoint clever_tesla on network brid
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Unit entered failed state.
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Failed with result 'exit-code'.
Jun 02 00:02:58 core-01 systemd[1]: Stopped chat.service.
Jun 02 00:03:08 core-01 systemd[1]: Stopped chat.service.
What had I done wrong in the first chat.service unit file? Any guidance is appreciated.
Running Vagrant version of CoreOS (stable) on Mac OS X.
Your ExecStartPre= command doesn't seem to have a docker subcommand in it. Did you mean to use pull?
Reading the journal for the unit should get you more information: journalctl -u chat.service
After looking into the journal as @Rob suggested and doing some research, it appears that Docker couldn't create an endpoint on port 3000 because a running Docker process on the host was already using that port.
Simply stopping that process with docker stop <processname> and relaunching with fleetctl start chat solved the problem.
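To find which container is holding the port before stopping it, docker ps can filter on published ports. A minimal sketch; containers_on_port is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: list the names of running containers that publish a given port.
containers_on_port() {
    docker ps --filter "publish=$1" --format '{{.Names}}'
}

# Usage:
#   containers_on_port 3000
```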
