Failed to validate 'docker' driver (kubernetes) - docker

Every time I try to run minikube start on Linux (Ubuntu 18.04), I get this Docker validation error.

This works fine for me:
myuser@mymachine:~$ minikube start --driver=docker
😄 minikube v1.11.0 on Ubuntu 16.04
✨ Using the docker driver based on user configuration
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
🔥 Creating docker container (CPUs=2, Memory=2200MB) ...
🌐 Found network options:
    ▪ NO_PROXY=169.254.169.254
🐳 Preparing Kubernetes v1.18.3 on Docker 19.03.2 ...
    ▪ env NO_PROXY=169.254.169.254
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🔎 Verifying Kubernetes components...
🌟 Enabled addons: default-storageclass, storage-provisioner
🏄 Done! kubectl is now configured to use "minikube"
Make sure that /var/run/docker.sock has the right permissions to be accessed by your user:
myuser@mymachine:~$ sudo chmod o+rw /var/run/docker.sock
myuser@mymachine:~$ ls -la /var/run/docker.sock
srw-rw-rw- 1 root docker 0 Jul 6 17:42 /var/run/docker.sock
Make sure the docker daemon is running:
myuser@mymachine:~$ ps -Af | grep dockerd
root 12723 1 0 Jul06 ? 00:01:11 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root 18598 17596 0 19:19 ? 00:00:05 /usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --default-ulimit=nofile=1048576:1048576 --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=docker --insecure-registry 10.96.0.0/12
adminra+ 31177 26444 0 19:36 pts/0 00:00:00 grep --color=auto dockerd
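If you want to check socket access in one go before running minikube, something like the following might help. It is only a sketch: the helper name can_access_socket is made up for illustration, and the path is assumed to be the default /var/run/docker.sock shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the current user can both
# read and write the given socket path (needed to talk to dockerd).
can_access_socket() {
    sock="$1"
    [ -e "$sock" ] && [ -r "$sock" ] && [ -w "$sock" ]
}

# Assumed default path; override DOCKER_SOCK if your daemon listens elsewhere.
DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"

if can_access_socket "$DOCKER_SOCK"; then
    echo "OK: $DOCKER_SOCK is accessible to $(id -un)"
else
    echo "No access to $DOCKER_SOCK (missing, or permissions too strict)"
fi
```

Note that chmod o+rw opens the socket to every local user. Adding your user to the docker group (sudo usermod -aG docker $USER, then logging in again) gives the same access through the existing group permissions.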

Related

Can nerdctl/crictl be used to list containers started by docker

I'm using Docker 20.10.21. As I understand it, this version of Docker uses containerd to manage the image and container lifecycle, so why can't I use crictl/nerdctl to list the containers I started with the docker CLI?
What I've tried:
Checked whether Docker uses containerd to manage containers. This is the output of systemctl status docker:
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─http-proxy.conf
Active: active (running) since Sun 2022-12-04 22:44:27 CST; 1min 18s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 1821 (dockerd)
Tasks: 91 (limit: 38297)
Memory: 229.6M
CPU: 1.214s
CGroup: /system.slice/docker.service
├─1821 /usr/bin/dockerd -H fd://
├─1845 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
I guess this means containerd is started by the Docker daemon, and that its Unix socket is located at /var/run/docker/containerd/containerd.sock.
Tried nerdctl to list the containers, but got an error message:
$ nerdctl --address unix:///var/run/docker/containerd/containerd.sock ps
FATA[0000] rootless containerd not running? (hint: use `containerd-rootless-setuptool.sh install` to start rootless containerd): stat /run/user/1000/containerd-rootless: no such file or directory
Then I tried it again with sudo:
sudo nerdctl --address unix:///var/run/docker/containerd/containerd.sock ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
As you can see, there's no container listed, but docker ps shows many containers I started.
Tried crictl to check the result, but got errors:
sudo crictl --r unix:///var/run/docker/containerd/containerd.sock ps
E1204 22:47:27.190569 3925 remote_runtime.go:557] "ListContainers with filter from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService" filter="&ContainerFilter{Id:,State:&ContainerStateValue{State:CONTAINER_RUNNING,},PodSandboxId:,LabelSelector:map[string]string{},}"
FATA[0000] listing containers: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService
So my question is: why can't I get the same results as the docker CLI with nerdctl/crictl? Have I done something wrong, or is my understanding wrong?
Thanks for any tips.

How to run crictl command as non root user

How can I run crictl as a non-root user?
My docker commands work as a non-root user because my user has been added to the docker group.
id
uid=1002(kube) gid=100(users) groups=100(users),10(wheel),1001(dockerroot),1002(docker)
I am running the dockerd daemon, which uses containerd and runc as its runtime.
I installed the crictl binary and pointed it at the existing dockershim socket with the config file below.
cat /etc/crictl.yaml
runtime-endpoint: unix:///var/run/dockershim.sock
image-endpoint: unix:///var/run/dockershim.sock
timeout: 2
debug: false
pull-image-on-create: false
crictl works fine with sudo, but without sudo it fails like this:
[user@hostname ~]$ crictl ps
FATA[0002] connect: connect endpoint 'unix:///var/run/dockershim.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded
I also tried changing the group of dockershim.sock from 'root' to 'docker', the same as docker.sock, but the result is still the same.
srwxr-xr-x 1 root docker 0 Jan 2 23:36 /var/run/dockershim.sock
srw-rw---- 1 root docker 0 Jan 2 23:33 /var/run/docker.sock
Add your user to the docker group:
sudo usermod -aG docker $USER
Alternatively, see the Docker post-installation steps.
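To confirm that the group change is active in your current session, a check along these lines could be used. This is a sketch only: session_has_group is a made-up helper name, and it relies on id -nG printing the groups of the current process.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the current login session
# is running with the named group.
session_has_group() {
    # Without a user argument, id -nG reports the groups of this process,
    # which only change after a new login (or newgrp).
    for g in $(id -nG); do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

if session_has_group docker; then
    echo "This session has the docker group"
else
    echo "This session lacks the docker group (log out and back in after usermod)"
fi
```

Group membership only helps if the socket itself grants the group read and write access. In the listing above, dockershim.sock is srwxr-xr-x, so members of the docker group still cannot write to it, while docker.sock (srw-rw----) allows them.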

failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid

I start Docker:
sudo service docker start
Then I try to run dockerd:
sudo dockerd
It shows the following error:
INFO[2021-11-21T19:25:52.804962676+05:30] Starting up
failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid
This worked for me:
sudo chmod 666 /var/run/docker.sock
Delete the PID file. Kill the running Docker service and start it again:
ps -ef | grep docker
sudo kill -9 <PIDs>
sudo systemctl start docker.service
Delete the .pid file with the following command:
sudo rm /var/run/docker.pid
Once the PID file is deleted, the Docker daemon can be started again.
I had the same problem. The following worked for me:
Deleted /var/run/docker.pid
Rebooted the computer
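Before deleting the PID file, it may be worth checking whether it is actually stale. If the service was already started (for example with sudo service docker start), the file belongs to a live daemon and a second sudo dockerd is simply not needed. The sketch below uses a made-up helper, pid_file_is_stale, and assumes the default path /var/run/docker.pid.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the PID file exists but the process
# it names is no longer running, i.e. the file is a leftover.
pid_file_is_stale() {
    pidfile="$1"
    [ -f "$pidfile" ] && [ -r "$pidfile" ] || return 1   # absent or unreadable
    pid="$(cat "$pidfile")"
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;     # empty or non-numeric content
    esac
    # kill -0 sends no signal; it only tests whether the PID exists.
    # The /proc check covers processes owned by another user on Linux.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1                     # a process with this PID is alive
    fi
    return 0
}

PIDFILE="${PIDFILE:-/var/run/docker.pid}"
if pid_file_is_stale "$PIDFILE"; then
    echo "$PIDFILE is stale; it can be removed with: sudo rm $PIDFILE"
else
    echo "$PIDFILE is absent, unreadable, or belongs to a running process"
fi
```

The check does not verify that the live process is really dockerd, so treat the result as a hint rather than proof.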
I had a similar issue:
`sudo docker ps -a`
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
sudo systemctl status docker
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: deactivating (stop-sigterm) since Wed 2022-09-07 09:32:11 -05; 5h 55min ago
Docs: https://docs.docker.com
Main PID: <PID_NO> (dockerd)
CGroup: /system.slice/docker.service
└─<PID_NO> /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
time="2022-09-07T09:32:26.-05:00" level=info msg="ccResol...=grpc
time="2022-09-07T09:32:26.-05:00" level=info msg="ClientC...=grpc
time="2022-09-07T09:32:26.-05:00" level=info msg="pickfir...=grpc
time="2022-09-07T09:32:26.-05:00" level=info msg="pickfir...=grpc
time="2022-09-07T09:32:26.-05:00" level=info msg="[graphd...lay2"
time="2022-09-07T09:32:26.-05:00" level=warning msg="moun...ound"
time="2022-09-07T09:32:26.-05:00" level=info msg="Loading...art."
systemd[1]: Dependency failed for Docker Application Container Engine.
systemd[1]: Job docker.service/start failed with result 'dependency'.
dockerd[<PID_NO>]: time="2022-09-07T09:39:52.-05:00" level=info msg="Process...ted'"
Hint: Some lines were ellipsized, use -l to show in full.
`sudo systemctl start docker` gives no output.
I deleted the docker.pid file in /var/run, but that didn't help either.

docker.socket: Failed with result 'service-start-limit-hit' after protecting docker daemon socket

I followed the steps in the Docker documentation to add TLS security to the Docker API. The certificates are located in both ~/.docker/ and /etc/docker/ssl/. I added override.conf to /etc/systemd/system/docker.service.d/ with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 --tlsverify --tlscacert=ca.pem --tlscert=server-cert.pem --tlskey=server-key.pem
Then I ran daemon-reload and started Docker:
$ systemctl daemon-reload
$ service docker start
The errors in journalctl -xe are:
-- Unit docker.socket has finished starting up.
--
-- The start-up result is RESULT.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.service: Start request repeated too quickly.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.service: Failed with result 'exit-code'.
Jan 15 21:43:24 cynicalplyaground systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit docker.service has failed.
--
-- The result is RESULT.
Jan 15 21:43:24 cynicalplyaground systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'.
Jan 15 21:45:01 cynicalplyaground CRON[12768]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 15 21:45:01 cynicalplyaground CRON[12769]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 15 21:45:01 cynicalplyaground CRON[12768]: pam_unix(cron:session): session closed for user root
How can I sort this issue?
In my case, the same error occurred after the latest Manjaro update (2020-01-20).
I tried changing the systemd docker service, as advised in other cases, but reverted those changes. In the end, the problem was solved by rebooting the system
(as advised here: https://www.reddit.com/r/archlinux/comments/7ya4ug/installing_docker_on_arch_linux/)
Getting to the root of the problem:
systemctl status docker.service
shows this command line:
/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Running that command by hand, it complains:
unable to configure the Docker daemon with file /etc/docker/daemon.json: EOF
ls -l /etc/docker/daemon.json
-rw-r--r-- 1 root root 0 Jul 30 10:32 /etc/docker/daemon.json
NOTE that the JSON file is empty. Delete it.
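A quick way to spot this case is to test whether the file has zero bytes. The following is a sketch under that assumption; config_state is a made-up helper name and it does not validate the JSON itself.

```shell
#!/bin/sh
# Hypothetical helper: prints "missing", "empty", or "present" for a file.
config_state() {
    if [ ! -e "$1" ]; then
        echo "missing"      # no config file at this path
    elif [ ! -s "$1" ]; then
        echo "empty"        # zero bytes: the case that produced "EOF" above
    else
        echo "present"      # has content; validity is not checked here
    fi
}

DAEMON_JSON="${DAEMON_JSON:-/etc/docker/daemon.json}"
echo "$DAEMON_JSON: $(config_state "$DAEMON_JSON")"
```

An empty JSON object is a valid configuration, so writing {} into the file is an alternative to deleting it.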
For me it was because Docker uses iptables for NAT, while Debian uses nftables by default. You can convert the entries to nftables or just set up Debian to use the legacy iptables:
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
dockerd should start fine after switching to iptables-legacy.
I had the same issue; changing "/usr/bin/dockerd" to "/usr/sbin/dockerd" made it work.
Check where dockerd is actually installed first, for example with command -v dockerd.
In my case, the host was part of a Docker swarm, but its IPv6 address was no longer reachable or automatically assigned to the host.
journalctl -u docker.service showed:
level=fatal msg="Error starting cluster component: could not find local IP address: dial udp [2xxx:xxx:xxxx:xxx]:2377: connect: network is unreachable"
I manually added the old IPv6 address:
ip -6 address add 28xx:xxxx:x:x:xx:ebff:fe14:xxx dev ens3x
After adding the IPv6 address manually, I was able to start Docker. With Docker running, I left the swarm and rebooted:
docker swarm leave --force
After the reboot, the Docker services ran as usual.
For me it was a lack of disk space. A reboot also helped, but I was still not able to build any container.
After pruning some outdated data from the Docker volumes, I was able to continue.
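To see how full the filesystem behind Docker's data directory is, a small check like this can be used. It is a sketch: used_percent_from_df is a made-up helper, and /var/lib/docker is assumed to be the data root (the default on most installs).

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P` output on stdin and prints the
# used-space percentage of the first listed filesystem, without the % sign.
used_percent_from_df() {
    # In POSIX df output, line 2 is the filesystem and column 5 is "Capacity".
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Assumed data root; fall back to / if the directory does not exist.
DATA_ROOT="/var/lib/docker"
[ -d "$DATA_ROOT" ] || DATA_ROOT="/"

echo "Filesystem holding $DATA_ROOT is $(df -P "$DATA_ROOT" | used_percent_from_df)% full"
```

docker system df shows how much of that space Docker itself is using, and docker volume prune removes volumes that no container references.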
I faced a similar issue on Ubuntu because I had added the hosts option to the /etc/docker/daemon.json file. That is fine in itself, but on systems that use systemd it can conflict with the arguments passed to dockerd on start.
The solution was to delete the hosts entry from /etc/docker/daemon.json and set this configuration in the file /etc/systemd/system/docker.service.d/options.conf instead.
$ cat /etc/systemd/system/docker.service.d/options.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix://
After that, restart the service.
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
You can check that your changes have been applied by running docker info. The docker service status should also show that the Drop-In field lists the options.conf you created, and that dockerd was executed with the specified host list.
$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset>
Drop-In: /etc/systemd/system/docker.service.d
└─options.conf
Active: active (running) since Fri 2022-11-18 01:02:18 EST; 1h 50min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 1111 (dockerd)
Tasks: 18
Memory: 58.5M
CPU: 1.294s
CGroup: /system.slice/docker.service
└─1111 /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix://
References:
Daemon configuration file
Control Docker with systemd
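The conflict described in this answer (hosts set in daemon.json while the unit file also passes -H) can be detected roughly as follows. This is a heuristic sketch: both helper names are made up, plain grep is used instead of a JSON parser, and the unit path is the one from the status output above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the JSON file appears to set a "hosts" key.
has_hosts_key() {
    [ -f "$1" ] && grep -q '"hosts"[[:space:]]*:' "$1"
}

# Hypothetical helper: succeeds if a unit or drop-in file passes
# -H or --host to dockerd on an ExecStart line.
has_host_flag() {
    [ -f "$1" ] && grep -Eq '^ExecStart=.*dockerd.*[[:space:]](-H|--host)([[:space:]=]|$)' "$1"
}

DAEMON_JSON="/etc/docker/daemon.json"
UNIT="/lib/systemd/system/docker.service"

if has_hosts_key "$DAEMON_JSON" && has_host_flag "$UNIT"; then
    echo "Conflict: hosts is set in both $DAEMON_JSON and $UNIT"
else
    echo "No hosts conflict detected between $DAEMON_JSON and $UNIT"
fi
```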
I had a similar issue on NixOS installed on a btrfs filesystem.
For me the solution was to add virtualisation.docker.storageDriver = "btrfs"; to my /etc/nixos/configuration.nix.
According to the Docker docs, this should be equivalent to adding the following to /etc/docker/daemon.json on most other distros:
{
"storage-driver": "btrfs"
}
I was able to solve the problem by disabling firewalld:
systemctl disable firewalld
systemctl stop firewalld

Docker remote API doesn't restart after my computer restarts

Last week I struggled to get my Docker remote API working. Since it runs on a VM, I had not restarted the VM since then. Today I finally restarted it, and the remote API no longer works (docker and docker-compose work normally). My Docker init file, /etc/init/docker.conf, looks like this:
description "Docker daemon"
start on filesystem and started lxc-net
stop on runlevel [!2345]
respawn
script
/usr/bin/docker -H tcp://0.0.0.0:4243 -d
end script
# description "Docker daemon"
# start on (filesystem and net-device-up IFACE!=lo)
# stop on runlevel [!2345]
# limit nofile 524288 1048576
# limit nproc 524288 1048576
respawn
kill timeout 20
.....
.....
Last time, I applied the settings indicated in a guide.
I tried nmap to see whether port 4243 is open.
ubuntu@ubuntu:~$ nmap 0.0.0.0 -p-
Starting Nmap 7.01 ( https://nmap.org ) at 2016-10-12 23:49 CEST
Nmap scan report for 0.0.0.0
Host is up (0.000046s latency).
Not shown: 65531 closed ports
PORT STATE SERVICE
22/tcp open ssh
43978/tcp open unknown
44672/tcp open unknown
60366/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 1.11 seconds
As you can see, port 4243 is not open.
When I run:
ubuntu@ubuntu:~$ echo -e "GET /images/json HTTP/1.0\r\n" | nc -U
This is nc from the netcat-openbsd package. An alternative nc is available
in the netcat-traditional package.
usage: nc [-46bCDdhjklnrStUuvZz] [-I length] [-i interval] [-O length]
[-P proxy_username] [-p source_port] [-q seconds] [-s source]
[-T toskeyword] [-V rtable] [-w timeout] [-X proxy_protocol]
[-x proxy_address[:port]] [destination] [port]
I also ran this:
ubuntu@ubuntu:~$ sudo docker -H=tcp://0.0.0.0:4243 -d
flag provided but not defined: -d
See 'docker --help'.
I have restarted my computer many times and tried a lot of things, with no success.
I already have a group named docker, and my user is in it:
ubuntu@ubuntu:~$ groups $USER
ubuntu : ubuntu adm cdrom sudo dip plugdev lpadmin sambashare docker
Please tell me what is wrong.
Your startup script contains an invalid command:
/usr/bin/docker -H tcp://0.0.0.0:4243 -d
Instead you need something like:
/usr/bin/docker daemon -H tcp://0.0.0.0:4243
As of 1.12, this is now (but docker daemon will still work):
/usr/bin/dockerd -H tcp://0.0.0.0:4243
Please note that this is opening a port that gives remote root access without any password to your docker host.
Anyone that wants to take over your machine can run docker run -v /:/target -H your.ip:4243 busybox /bin/sh to get a root shell with your filesystem mounted at /target. If you'd like to secure your host, follow this guide to setting up TLS certificates.
I finally found www.ivankrizsan.se and it is working fine now. Thanks to this guy (or girl) ;).
These settings work for me on Ubuntu 16.04. Here is how to do it:
Edit the file /lib/systemd/system/docker.service and replace the line ExecStart=/usr/bin/dockerd -H fd:// with
ExecStart=/usr/bin/docker daemon -H fd:// -H tcp://0.0.0.0:4243
Save the file.
Reload systemd and restart with: sudo systemctl daemon-reload && sudo service docker restart
Test with: curl http://localhost:4243/version
Result: you should see something like this:
{"Version":"1.11.0","ApiVersion":"1.23","GitCommit":"4dc5990","GoVersion":"go1.5.4","Os":"linux","Arch":"amd64","KernelVersion":"4.4.0-22-generic","BuildTime":"2016-04-13T18:38:59.968579007+00:00"}
Attention:
Keep in mind that binding to 0.0.0.0 exposes the API on every network interface, which is bad for security. For better security, bind to 127.0.0.1 instead.
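To make the difference between the two addresses concrete, here is a small sketch that classifies a dockerd -H value. The helper name bind_scope and its category labels are made up for illustration; it only looks at the address string and does not inspect a running daemon.

```shell
#!/bin/sh
# Hypothetical helper: classifies a dockerd -H address by how widely it listens.
bind_scope() {
    case "$1" in
        tcp://0.0.0.0:*|tcp://\[::\]:*)                 echo "all-interfaces" ;;
        tcp://127.*|tcp://localhost:*|tcp://\[::1\]:*)  echo "loopback" ;;
        unix://*)                                       echo "unix-socket" ;;
        fd://*)                                         echo "systemd-socket" ;;
        *)                                              echo "other" ;;
    esac
}

for addr in tcp://0.0.0.0:4243 tcp://127.0.0.1:4243 unix:///var/run/docker.sock; do
    echo "$addr -> $(bind_scope "$addr")"
done
```

Anything reported as all-interfaces is reachable from other machines unless a firewall blocks the port, so it should be combined with TLS as mentioned in the earlier answer.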

Resources