Google Cloud Logging Driver cannot find credentials after reboot - docker

I've followed the directions here, and everything works well until I restart my computer. After restarting, it seems like the docker daemon loses track of the Google credentials.
$ docker run --log-driver=gcplogs ...
fails with:
docker: Error response from daemon: failed to initialize logging driver: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
ERRO[0000] error waiting for container: context canceled
This is strange to me, because running $ systemctl show --property=Environment docker returns the value in my systemd configuration:
Environment=GOOGLE_APPLICATION_CREDENTIALS=/etc/path/to/application_default_credentials.json
If I $ sudo systemctl restart docker, then docker runs sucessfully and logs are sent to stackdriver. But I want this docker image to run automatically on startup, and restarting docker with sudo gets in the way.
Is there a way to initialize the docker daemon with the necessary environment variables, so gcplogs is ready on boot without restarting docker?

I had two versions of docker installed -- one through adding docker's repo to apt, and one through snap. Running
sudo systemctl list-unit-files| grep docker | grep enabled
showed two installations of docker:
docker.service enabled
snap.docker.dockerd.service enabled
Having two docker installations was causing problems for startup. I removed the snap installation, rebooted, and everything now works.

I think you may try to edit the systemd: Unit dependencies and order, let docker.service start after google-accounts-daemon.service.
You can see all the service in google vm by
sudo systemctl list-unit-files| grep google | grep enabled
And you will see
google-accounts-daemon.service enabled
google-clock-skew-daemon.service enabled
google-instance-setup.service enabled
google-network-daemon.service enabled
google-shutdown-scripts.service enabled
google-startup-scripts.service enabled

Related

Restart docker daemon in rootless mode (on Linux)

How can I restart docker daemon running in rootless mode on Linux?
Stopping it works fine with:
docker --user stop docker.service
but starting it back again fails when using:
docker --user start docker.service
The command doesn't return anything but when checking the docker info it says:
ERROR: Cannot connect to the Docker daemon at unix:///run/user/1000/docker.sock. Is the docker daemon running?
It doesn't give any further information...
I had this error a couple of times before, when I accidentally run docker with sudo and therefore got mixed up permissions in my data-root (defined in daemon.json). But this time chowning it back to $USER didn't help with the restart. Also restarting the host machine didn't help (as it did a couple of times previously).
Ok, it seems that "userns-remap" is not compatible with rootless mode:
Rootless mode executes the Docker daemon and containers inside a user namespace. This is very similar to userns-remap mode, except that with userns-remap mode, the daemon itself is running with root privileges, whereas in rootless mode, both the daemon and the container are running without root privileges. Rootless mode does not use binaries with SETUID bits or file capabilities, except newuidmap and newgidmap, which are needed to allow multiple UIDs/GIDs to be used in the user namespace.
I was trying to fix permission issues on shared volumes by experimenting with setting UIDs/GIDs and added "userns-remap" to the ~/.config/docker/daemon.json:
{
"data-root": "/home/me/docker/image-storage",
"userns-remap": "me"
}
So deleting userns-remap from the config file fixed the restarting issue... Man, docker, at least a hint to the config file would be great... Because the userns-remap option was mentioned on some official docker doc pages I didn't even consider it as the source of the trouble in the first place.

Error response from daemon: Cannot kill container: permission denied, how to kill docker containers on Ubuntu 20.04?

I'm trying to kill a docker container, but I got permission denied. I use Ubuntu 20.04, my docker version for client is 20.10.7 and the one for the server is 20.10.11.
This is the log I got:
Error response from daemon: Cannot kill container: fastapi_server: permission denied
I read that I should use this comand for restarting docker.
sudo systemctl restart docker.socket docker.service
But the thing is that when I execute this command, all my containers and images dissapear, but If I try on localhost:8000 my port is occupied by the container that I wanted to delete. And if I run sudo netstat -anp | grep 8000, I get:
tcp 0 0 0.0.0.0:8000 0.0.0.0:* LISTEN 2493/docker-proxy
tcp6 0 0 :::8000 :::* LISTEN 2500/docker-proxy
So this confirms that my port is already taken by a docker container, but when I run docker ps -a, I get no container. I also tried docker kill, but it did not work.
How should I kill this container & get my 8000 port free?
Please think twice before removing AppArmor. To my understanding this is central to application security for instance on recent major Ubuntu versions.
It seems the rights problem is specific to a Docker version. Assuming yours is also installed via snap, please attempt upgrading your Docker version to at least the current beta, e.g. with
snap refresh docker --beta
20.10.12 seems to work fine.
(In fact I fell for the suggestion and did remove my AppArmor - snaps went away. Then reinstalled ASAP, the settings of relevant snaps are still with me - afterwards installed docker back, had the problem, upgraded it: seems to work like a charm.)
It appeared that I had installed docker with snap as well as using the docker repository:
sudo snap list
So:
sudo snap remove docker --purge
sudo aa-remove-unknown
Along with re-installing Docker using the method described here solved my issues! No need to disable or remove apparmor.
Try these steps:
docker inspect
Find the PID AND kill that process.
If that does not work check with
dmesg
everything related to Docker. You can put output here that we can help you.
Ok,from you png ist seems that you have problem with AppArmor. Try this:
sudo apt purge --auto-remove apparmor
sudo service docker restart
docker system prune --all --volumes
what works for me in these cases:
sudo systemctl restart docker.socket docker.service
sudo docker image rm -f $(sudo docker image ls -q)
I installed Docker from snap and experienced the permission denied error response. After reading many users experiencing more problems with the apparmor suggestion, I uninstalled Docker from snap, then used digitalocean's Docker installation tutorial.
It worked for me, posting here as reference for others experiencing the same problem.
In my case it was also apparmor on Ubuntu 20.04 after upgrade from Bionic. By running dmesg I got error message:
[1113458.482007] audit: type=1400 audit(1672134271.112:1718): apparmor="DENIED" operation="signal" profile="docker-default" pid=1654 comm="dockerd" requested_mask="receive" denied_mask="receive" signal=kill peer="snap.docker.dockerd
To fix this please edit /etc/apparmor.d/docker and add to the beginning (however, after the 'profile docker-default .... {' ) the following line:
signal,
Then reload apparmor
sudo systemctl reload apparmor
This fixed it at least on my computer.
See more https://manpages.ubuntu.com/manpages/xenial/man5/apparmor.d.5.html under section signal:
Example AppArmor signal rules:
# Allow all signal access
signal,

Docker daemon (not containers) can't read environment variables

Trying to configure a container running outside of GCP to log to Google Cloud Platform (StackDriver). One requirement is that the Docker daemon is able to locate the environment variable GOOGLE_APPLICATION_CREDENTIALS so it can authenticate. One would assume that the following would work, but it doesn't:
GOOGLE_APPLICATION_CREDENTIALS=/usr/local/keys/project-1.json docker run --log-driver=gcplogs ...
That outputs:
ERROR: for api Cannot start service api:
failed to initialize logging driver: google: could not find default credentials.
See https://developers.google.com/accounts/docs/application-default-credentials
for more information.
Haven't found any documentation on how to set that directly on daemon.json, but I don't want that either because I might have different containers logging to different GCP projects.
I've tried this on Mac (docker desktop) and Debian.
This is question that keeps coming back. What is happening here is that environment variable GOOGLE_APPLICATION_CREDENTIALS is loaded by the system docker daemon. System daemons don't see the environment variables set in the user login. What you need to do is set the GOOGLE_APPLICATION_CREDENTIALS at the system level.
Here is how to do that in Ubuntu(Systemd):
$ sudo mkdir -p /etc/systemd/system/docker.service.d
Create /etc/systemd/system/docker.service.d/env.conf with the following content:
[Service]
Environment="GOOGLE_APPLICATION_CREDENTIALS=/path/to/file.json"
Apply the changes.
$ sudo systemctl daemon-reload
Once done restart docker/containerd daemons
$ sudo systemctl restart containerd
$ sudo systemctl restart docker
Test the gcplogs driver
docker run --log-driver=gcplogs --log-opt gcp-project="my-project" hello-world

How to use SystemD for running a docker command before docker is stopped in Docker v.18.09.3

In my use-case, I want to backup a Docker volume using SystemD before docker-daemon is stopped.
I got a working version using Docker 17.03.2. The SystemD service is defined as follows:
[Unit]
Description=Backup some Docker volume
Requires=network-online.target docker.service
After=docker.service
[Service]
Type=oneshot
ExecStop=/bin/sh /var/dobackup.sh
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
The content of /var/dobackup.sh is not that important here. It includes a docker command, which uses a given Docker volume and does a tar on it.
It might be important, that I am running this in the Google cloud compute engine, in a container optimized OS. In special, in milestone 69 (, which includes Docker v17.03.2).
Updating to Docker v18.09.3
Now, I want to update to Docker v18.09.3 (in special, I am updating the running OS to Container Optimized OS milestone 73).
The service from above does not work any more. I get the following error, when my docker-command in /var/dobackup.sh is running:
docker: Error response from daemon: all SubConns are in
TransientFailure, latest connection error: connection error: desc =
"transport: Error while dialing dial
unix:///var/run/containerd/containerd.sock: timeout": unavailable.
The problem is obviously in ContainerD not being available any more. I tried
Requires=network-online.target containerd.service docker.service
without success.
How can I adapt my service to Docker v18.09?
Some users have reported similar issues with containerd.server in previous Docker versions (17.12 and 18.03). The workaround applied was:
killall -9 dockerd
sudo service docker restart
In this link is mentioned a similar error and how users sorted it out the problem after restarting docker service.
Likely this is caused by a containerd/docker integration issue fixed in cos-73-11647-192-0. Could you try it on cos-73-11647-192-0?

Network timed out while trying to connect to https://index.docker.io

I installed Docker-Toolbox just now while following their webpage
I started with Docker QuickStart Terminal and see following
## .
## ## ## ==
## ## ## ## ## ===
/"""""""""""""""""\___/ ===
~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ / ===- ~~~
\______ o __/
\ \ __/
\____\_______/
docker is configured to use the default machine with IP 192.168.99.100
For help getting started, check out the docs at https://docs.docker.com
bash-3.2$
But when I try to perform docker pull hello-world, this is what I see
bash-3.2$ docker run hello-world
Unable to find image 'hello-world:latest' locally
Pulling repository docker.io/library/hello-world
Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/hello-world/images. You may want to check your internet connection or if you are behind a proxy.
bash-3.2$
What's wrong?
I had the same problem this morning and the following fixed it for me:
$ docker-machine restart default # Restart the environment
$ eval $(docker-machine env default) # Refresh your environment settings
It appears that this is due to the Docker virtual machine getting itself into a strange state. There is an open github issue here
I installed Docker without the Toolbox on Windows 10, so the version that requires Hyper-V to be enabled.
For Docker version 1.12 I had to go into the taskbar, right click the Docker Icon, select Settings -> Network and set the DNS Server to fixed, so that is uses Google's DNS server at 8.8.8.8.
Once that setting was changed, it finally worked.
The simpler solution is to add the following entry in /etc/default/docker file
export http_proxy="http://HOST:PORT/"
and restart the docker service
service docker restart
Update August 2016
Using Docker for Mac (version 1.12.0), was seeing issues of the form:
➜ docker pull node
Using default tag: latest
Pulling repository docker.io/library/node
Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/node/images. You may want to check your internet connection or if you are behind a proxy.`enter code here`
This was resolved by updating my MacBook Pro wireless network settings to include the following DNS entry: 8.8.8.8
For further info, please see this (dated) issue which provided the answer given here.
I ran into this problem running Docker on my MAC(host) with Docker VM in VBOX 5.10. It is a networking issue. The simple fix is to add a bridged network to the VBOX image. You can use the included NAT config present with the VM, but you need to change the ssh port from 50375 to 2375.
sudo service docker stop
sudo service docker start
works for me..
somehow, sudo service docker restart didn't work
(RHEL7)
On Windows 7 and if you believe you are behind proxy
Logon to default machine
$ docker-machine ssh default
Update profile to update proxy settings
docker#default:~$ sudo vi /var/lib/boot2docker/profile
Append from the below as appropriate
# replace with your office's proxy environment
export"HTTP_PROXY=http://PROXY:PORT"
export"HTTPS_PROXY=http://PROXY:PORT"
# you can add more no_proxy with your environment.
export"NO_PROXY=192.168.99.*,*.local,169.254/16,*.example.com,192.168.59.*"
Exit
docker#default:~$ exit
Restart docker machine
docker-machine restart default
Update environment settings
eval $(docker-machine env default)
Above steps are slightly tweaked but as given in troubleshooting guide: https://docs.docker.com/toolbox/faqs/troubleshoot/#/update-varlibboot2dockerprofile-on-the-docker-machine
I ran into this exact same problem yesterday and none of the "popular" answers (like fixing DNS to 8.8.8.8) worked for me. I eventually happened across this link, and that did the trick ... https://github.com/docker/for-win/issues/16
Between Docker for Windows, Windows 10 and Hyper-V, there seems to be a problem during the virtual network adapter creation process. Specifically, you might end up with two "vEthernet (DockerNAT)" network adapters. Check this with Get-NetAdapter "vEthernet (DockerNAT)" (in an elevated PowerShell console). If the result shows more than one adapter, you can disable and rename it with:
$vmNetAdapter = Get-VMNetworkAdapter -ManagementOS -SwitchName DockerNAT
Get-NetAdapter "vEthernet (DockerNAT)" | ? { $_.DeviceID -ne $vmNetAdapter.DeviceID } | Disable-NetAdapter -Confirm:$False -PassThru | Rename-NetAdapter -NewName "OLD"
Then open up Device Manager and delete the disabled adapter (for some reason you can do this from here, but not from the Network and Sharing Center adapters view).
I assume that you have a network problem. Are you behind a proxy? Is it possible that it filters the connection to docker.io or blocks the docker user agent?
I installed the toolbox and ran your test. It works fine, here:
docker is configured to use the default machine with IP 192.168.99.101
For help getting started, check out the docs at https://docs.docker.com
bash-3.2$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
535020c3e8ad: Pull complete
af340544ed62: Already exists
library/hello-world:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Digest: sha256:d5fbd996e6562438f7ea5389d7da867fe58e04d581810e230df4cc073271ea52
Status: Downloaded newer image for hello-world:latest
Hello from Docker.
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker Hub account:
https://hub.docker.com
For more examples and ideas, visit:
https://docs.docker.com/userguide/
bash-3.2$
On Windows 10. Just right-click on the systray docker icon-> Settings... -> Rest -> Restrart Docker
I had this same problem with boot2docker and fixed it by restarting it with:
boot2docker restart
I just ran into this today with 1.10.1 and none of the existing solutions worked. I tried to restart, upgrade, regenerate certs, ...
I noticed that I had a lot of networks created on the machine. After removing them with:
docker network ls | grep bridge | awk '{print $1}' | xargs -n1 docker network rm
The DNS started working again.
Note: You may ignore errors about pre-defined networks
If you are behind proxy it is not enough to set HTTP_PROXY and HTTPS_PROXY env. You should set it while machine creation.
Paramer for this is --engine-env:
docker-machine create -d "virtualbox" --engine-env HTTP_PROXY=http://<PROXY>:<PORT> --engine-env HTTPS_PROXY=<PROXY>:<PORT> dev
In my case, installing docker on Alpine Linux I get the error:
Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/........
Using the script here:
https://github.com/docker/docker/blob/master/contrib/download-frozen-image-v2.sh
Works. It downloads the image using curl and then shows you how to untar and 'docker load' it.
I tried the above methods of static DNS at 8.8.8.8 and disabling ipv6 (I didn't understand the proxy thing) and none of them worked for me.
EDIT 9/8/2016:
I was initially using dropbear instead of openssh. Reinstalled Alpine with openssh fixed the problem.
The next problem was 'ApplyLayer exit status 1 stdout: stderr: chmod /bin/mount: permission denied' error during pull.
From (nixaid.com/grsec-in-docker/):
To build the Docker image, I had to disable the following grsec
protections. Modify the /etc/sysctl.d/grsec.conf as follows:
kernel.grsecurity.chroot_deny_chmod = 0
kernel.grsecurity.chroot_deny_mknod = 0
kernel.grsecurity.chroot_caps = 0 # related to a systemd package/CAP_SETFCAP
in alpine's case though it's
/etc/sysctl.d/00-alpine.conf
reboot
Restarting Docker or recreating the image did not help. I rebooted Windows to no avail.
Astoundingly, when I ssh'ed into the running container and did curl https://index.docker.io/v1/repositories/library/hello-world/images I got a perfectly valid response.
I used the Docker Toolbox with VirtualBox on 64bit Windows 10 Pro.
The solution in my case was to uninstall the old Docker version and install the new one that uses Hyper-V instead of VirtualBox.
Now Docker works again.
If you are behind proxy kindly use below commands
sudo mkdir /etc/systemd/system/docker.service.d
sudo cd /etc/systemd/system/docker.service.d
sudo vi http-proxy.conf
[Service]
Environment=HTTP_PROXY=http://proxy-server-ip:port" "NO_PROXY=localhost,127.0.0.1"
sudo systemctl daemon-reload
sudo systemctl show --property=Environment docker
sudo systemctl restart docker
Try this if you can fetch latest ubuntu
sudo docker run -it ubuntu bash
Unable to find image ubuntu:latest locally
latest: Pulling from library/ubuntu b3e1c725a85f: Pull complete
4daad8bdde31: Pull complete
63fe8c0068a8: Pull complete
4a70713c436f: Pull complete
bd842a2105a8: Pull complete
Digest:
sha256:7a64bc9c8843b0a8c8b8a7e4715b7615e4e1b0d8ca3c7e7a76ec8250899c397a
Status: Downloaded newer image for ubuntu:latest
It worked for me finally :)
Another scenario: if your docker network adapter is disabled, it will fail with this error. The adapter is named "vEthernet (DockerNAT)" or similar. Apparently this adapter is involved somehow in the normal docker pull behavior. Enable it back to solve the problem.
Create a systemd drop-in directory for the docker service:
$ sudo mkdir -p /etc/systemd/system/docker.service.d
Create a file called /etc/systemd/system/docker.service.d/http-proxy.conf that adds the HTTP_PROXY environment variable:
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80/"
Hope it helps
refer to https://docs.docker.com/network/proxy/
for me, proxy setting without http:// or https:// prefix works.
e.g:
PROXY:PORT
or with / suffix with http:// or https:// prefix
e.:
http://PROXY:PORT/
On Windows this happened when I moved from a work network to a home network.
To solve it, run:
docker-machine stop
docker-machine start
docker-env
"C:\Program Files\Docker Toolbox\docker-machine.exe" env | Invoke-Expression

Resources