Running docker-machine as a snap vs standard installation - docker

Is there any notable difference between running docker-machine as a snap and running a build from source? I'm having a networking issue and I suspect it may be related to the type of installation.
This is what I get when I try to point the active host at Docker:
executing: docker-machine env instance
output:
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host
after executing: docker-machine -D regenerate-certs instance
output:
SSH cmd err, output: fork/exec /usr/bin/ssh: permission denied:
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err : fork/exec /usr/bin/ssh: permission denied
Changing the permissions on the path mentioned in the error didn't help.
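One way to narrow this down is to check which kind of installation is actually on the PATH. The sketch below assumes that snap-packaged commands resolve to a path under /snap/ (the usual layout on Ubuntu). The function name is made up for illustration. Whether snap confinement is really what blocks /usr/bin/ssh here is not confirmed by the post, so treat this as a first diagnostic step only.

```shell
# Report whether an executable path lives under a snap mount point.
# Assumption: snap-packaged binaries resolve to a path beginning with /snap/.
is_snap_path() {
    case "$1" in
        /snap/*) echo "snap" ;;
        *)       echo "not-snap" ;;
    esac
}

# Example (not run here): is_snap_path "$(command -v docker-machine)"
```

If the result is "snap", comparing against a non-snap binary on the same host would show whether the permission error follows the packaging.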

Related

docker context not working on ssh remote server

I tried to connect to a remote server with docker context.
When I ran docker ps from the client (local Mac), I got this error message:
error during connect: Get "http://docker.example.com/v1.24/containers/json": command [ssh -l wwww -- tane-dev-0.ccn docker system dial-stdio] has exited with exit status 255, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=wwww#xxx-dev-0.xxx: Permission denied (publickey,gssapi-keyex,gssapi-with-mic)
After googling this problem, I found a few things to check:
Docker version 18.09 or later ✅
client: 20.10.14
remote: 20.10.17
SSH key registered with ssh-agent ✅
ssh-add ~/.ssh/id_rsa
Identity added: /Users/ma_kyeongwook/.ssh/id_rsa (ma_kyeongwook#xxx.xxx)
ssh-add -l
4096 SHA256:~~~ ma_kyeongwook#xxx.xxx (RSA)
SSH connection test ✅
I also checked with cat ~/.ssh/authorized_keys that it contains the client's public key.
Did I miss something?
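The error message quotes the exact ssh command the Docker CLI ran, including the login name passed with -l. A minimal sketch that rebuilds that command line, so it can be run by hand and compared with the manual SSH test, might look like this. The function name is hypothetical, and the user and host are whatever the context endpoint contains.

```shell
# Print an ssh invocation shaped like the one in the error message above.
# $1 = remote user, $2 = remote host (both taken from the context endpoint).
context_ssh_command() {
    printf 'ssh -l %s -- %s docker system dial-stdio\n' "$1" "$2"
}

# Example (not run here): context_ssh_command wwww tane-dev-0.ccn
```

If the printed command fails with the same "Permission denied (publickey, ...)" message while a plain ssh login works, the two attempts are probably using different users or keys.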

Can I run k8s master INSIDE a docker container? Getting errors about k8s looking for host's kernel details

I want to run k8s in a Docker container.
When I run kubeadm join ... or kubeadm init, I sometimes see errors like
"modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/3.10.0-1062.1.2.el7.x86_64/modules.dep.bin'\nmodprobe: FATAL: Module configs not found in directory /lib/modules/3.10.0-1062.1.2.el7.x86_64", err: exit status 1
because (I think) my container does not have the expected kernel header files.
I realise that the container reports its kernel based on the host that is running the container; and looking at k8s code I see
// getKernelConfigReader search kernel config file in a predefined list. Once the kernel config
// file is found it will read the configurations into a byte buffer and return. If the kernel
// config file is not found, it will try to load kernel config module and retry again.
func (k *KernelValidator) getKernelConfigReader() (io.Reader, error) {
	possibePaths := []string{
		"/proc/config.gz",
		"/boot/config-" + k.kernelRelease,
		"/usr/src/linux-" + k.kernelRelease + "/.config",
		"/usr/src/linux/.config",
	}
So I am a bit confused about the simplest way to run k8s inside a container such that it consistently gets past this kernel info check.
I note that when running docker run -it solita/centos-systemd:7 /bin/bash on a macOS host, I see:
# uname -r
4.9.184-linuxkit
# ls -l /proc/config.gz
-r--r--r-- 1 root root 23834 Nov 20 16:40 /proc/config.gz
but running the exact same command on an Ubuntu VM, I see:
# uname -r
4.4.0-142-generic
# ls -l /proc/config.gz
ls: cannot access /proc/config.gz
[Weirdly I don't see this FATAL: Module configs not found in directory error every time, but I guess that is a separate question!]
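The lookup order in the quoted Go snippet can be checked by hand inside the container. The sketch below walks the same four paths and reports which ones exist. The function name and the optional root-prefix argument are made up for illustration (the prefix only makes the function easy to try against a scratch directory).

```shell
# Print the candidate kernel config paths from the quoted Go snippet,
# each prefixed with "found" or "missing".
# $1 = kernel release (defaults to `uname -r`)
# $2 = optional root prefix (hypothetical, handy for testing)
kernel_config_candidates() {
    release="${1:-$(uname -r)}"
    root="${2:-}"
    for p in \
        "/proc/config.gz" \
        "/boot/config-$release" \
        "/usr/src/linux-$release/.config" \
        "/usr/src/linux/.config"
    do
        if [ -e "$root$p" ]; then
            echo "found   $p"
        else
            echo "missing $p"
        fi
    done
}
```

Running kernel_config_candidates with no arguments in both containers would show why the macOS host (which has /proc/config.gz) and the Ubuntu VM behave differently.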
UPDATE 22/November/2019. I see now that k8s DOES run okay in a container. Real problem was weird/misleading logs. I have added an answer to clarify.
I do not believe that is possible given the nature of containers.
You should instead test your app in a docker container then deploy that image to k8s either in the cloud or locally using minikube.
Another option is to run it under kind, which runs the cluster nodes as Docker containers instead of VirtualBox VMs:
https://kind.sigs.k8s.io/docs/user/quick-start/
It seems the FATAL error part was a bit misleading.
It was badly formatted by my test environment (all on one line).
When k8s was failing I saw the FATAL and assumed (incorrectly) that it was the root cause.
When I format the logs nicely I see:
kubeadm join 172.17.0.2:6443 --token 21e8ab.1e1666a25fd37338 --discovery-token-unsafe-skip-ca-verification --experimental-control-plane --ignore-preflight-errors=all --node-name 172.17.0.3
[preflight] Running pre-flight checks
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.4.0-142-generic
DOCKER_VERSION: 18.09.3
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.3. Latest validated version: 18.06
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-142-generic/modules.dep.bin'\nmodprobe: FATAL: Module configs not found in directory /lib/modules/4.4.0-142-generic\n", err: exit status 1
[discovery] Trying to connect to API Server "172.17.0.2:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.17.0.2:6443"
[discovery] Failed to request cluster info, will try again: [the server was unable to return a response in the time allotted, but may still be processing the request (get configmaps cluster-info)]
There are other errors later, which I originally thought were a side effect of the nasty-looking FATAL error, e.g. .... "[util/etcd] Attempt timed out"]}, but I now think the root cause is that the etcd part sometimes times out.
I am adding this answer in case someone else is puzzled like I was.
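The "all on one line" problem comes from literal \n sequences left inside the quoted modprobe output. A small filter like the following, a sketch rather than anything kubeadm provides, expands them into real line breaks so the WARNING and FATAL parts are easier to tell apart. The function name is hypothetical.

```shell
# Expand literal "\n" sequences (a backslash followed by n) into real line breaks.
# Reads log text on stdin and writes the reformatted text to stdout.
unescape_newlines() {
    awk '{ gsub(/\\n/, "\n"); print }'
}

# Example (not run here): kubeadm join ... 2>&1 | unescape_newlines
```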

Live migration of a jboss/wildfly container with CRIU failed

I've tried to live migrate a WildFly container to another host as described here. The example with the np container works well. When I replace the example with a simple jboss/wildfly container, I receive this error when criu tries to restore the container on the other host:
Error response from daemon: Cannot restore container <CONTAINER-ID>: criu failed: type NOTIFY errno 0
Error: failed to restore one or more containers
Because I couldn't find a solution to this error, I compiled the Linux kernel as described on the CRIU website and here.
After that, sudo criu check prints:
Warn (criu/libnetlink.c:54): ERROR -2 reported by netlink
Warn (criu/libnetlink.c:54): ERROR -2 reported by netlink
Warn (criu/sockets.c:711): The current kernel doesn't support packet_diag
Warn (criu/libnetlink.c:54): ERROR -2 reported by netlink
Warn (criu/sockets.c:721): The current kernel doesn't support netlink_diag
Info prctl: PR_SET_MM_MAP_SIZE is not supported
Looks good.
criu --version
Version: 2.11
docker --version
Docker version 1.6.2, build 7c8fca2
Checkpoint/restore of an example shell script worked very well. But when I try to checkpoint a container started with
docker run -d --name looper busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
with
criu dump -t $PID --images-dir /tmp/looper
I receive this output
Error (criu/sockets.c:132): Diag module missing (-2)
Error (criu/sockets.c:132): Diag module missing (-2)
Error (criu/sockets.c:132): Diag module missing (-2)
Error (criu/mount.c:701): mnt: 87:./etc/hosts doesn't have a proper root mount
Error (criu/cr-dump.c:1641): Dumping FAILED.
I can't find any solutions for these errors. Is there a known way to live migrate a WildFly container?
Thanks in advance
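The "Diag module missing" errors and the packet_diag / netlink_diag warnings suggest that the custom kernel was built without some of the socket-diag options. The sketch below lists which of them are not enabled in a plain-text kernel config file. The option names are the ones I recall from the CRIU kernel requirements and should be checked against the CRIU website. The function name is made up for illustration.

```shell
# List the socket-diag kernel options that are not enabled (=y or =m)
# in the plain-text kernel config file given as $1.
# Assumption: this option list matches what CRIU expects; verify it upstream.
missing_diag_options() {
    for opt in CONFIG_UNIX_DIAG CONFIG_INET_DIAG CONFIG_INET_UDP_DIAG \
               CONFIG_PACKET_DIAG CONFIG_NETLINK_DIAG
    do
        grep -Eq "^${opt}=(y|m)\$" "$1" || echo "$opt"
    done
}

# Example (not run here): missing_diag_options "/boot/config-$(uname -r)"
```

An empty result means all five options are present in that config. Options built as modules (=m) still need to be loadable at dump time.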

Docker-machine create with generic driver, Certificates not working but SSH does

I'm trying to get a docker-machine up and running on an Ubuntu 14.04 LTS server in our network. I have installed docker and docker-machine on the server, and I'm able to create the docker-machine on the server with this command from my computer:
docker-machine create --driver generic --generic-ip-address 10.10.3.76 --generic-ssh-key "/Users/username/Documents/keys/mysshkey.pem" --generic-ssh-user ubuntuuser dockermachinename
The command above creates the docker-machine and I'm able to list it with
docker-machine ls
I'm able to SSH to it by running
docker-machine ssh dockermachinename
but when I try to connect to the server with (-D for debug information)
docker-machine -D env dockermachinename
I get the following message
Docker Machine Version: 0.5.2 ( 0456b9f )
Found binary path at /usr/local/bin/docker-machine-driver-generic
Launching plugin server for driver generic
Plugin server listening at address 127.0.0.1:54213
() Calling .GetVersion
Using API Version 1
() Calling .SetConfigRaw
() Calling .GetMachineName
(dockermachinename) Calling .GetState
(dockermachinename) Calling .GetURL
Reading CA certificate from /Users/username/.docker/machine/certs/ca.pem
Reading server certificate from /Users/username/.docker/machine/machines/dockermachinename/server.pem
Reading server key from /Users/username/.docker/machine/machines/dockermachinename/server-key.pem
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "10.10.3.76:2376": dial tcp 10.10.3.76:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.
I really need to solve this so all help is appreciated!
On Ubuntu you will need to do the following steps:
1. Create a user that doesn't require a password for sudo
sudo visudo
At the end of the file add the following line (make sure to specify your username):
username ALL=(ALL:ALL) NOPASSWD: ALL
then save and exit. After that, add your username to the docker group like this (replace username with your actual username):
sudo usermod -aG docker username
2. Edit the Docker config to open ports 2375 and 2376
sudo systemctl edit docker.service
add following snippet to that file:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://0.0.0.0:2376 -H tcp://0.0.0.0:2375
then save and exit. After that, reload the config and restart the Docker daemon with:
sudo systemctl daemon-reload
sudo systemctl restart docker.service
3. Create docker-machine
Remove existing machine which is failing with:
docker-machine rm machine1
and try to create it one more time like this:
docker-machine create -d generic --generic-ip-address ip --generic-ssh-key ~/.ssh/key --generic-ssh-user username --generic-ssh-port 22 machine1
Please replace ip, key, username and machine1 with your actual values.
If this produces an error like this:
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.0.26:2376": tls: oversized record received with length 20527
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.
then SSH to your machine and cd into the following directory:
cd /etc/systemd/system/docker.service.d/
List all files in it with:
ls -l
You will probably have something like this:
-rw-r--r-- 1 root root 274 Jul 2 17:47 10-machine.conf
-rw-r--r-- 1 root root 101 Jul 2 17:46 override.conf
You will need to delete all files except 10-machine.conf with sudo rm.
After that, remove the machine you created and create it again. It should now work. I hope this helps. If you have already done steps 1 and 2, skip them and just try removing the override.conf file or any other file in that directory that is not 10-machine.conf.
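Before deleting anything, it may help to list exactly which drop-in files would be affected. The following is a minimal sketch of that check. The function name is hypothetical, and the default directory is the one named in the answer above.

```shell
# Print every entry in a systemd drop-in directory except 10-machine.conf.
# $1 = directory (defaults to the path named in the answer above).
extra_dropins() {
    dir="${1:-/etc/systemd/system/docker.service.d}"
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        [ "$(basename "$f")" = "10-machine.conf" ] || echo "$f"
    done
}

# Example (not run here): extra_dropins
```

The function only prints paths. Removing them with sudo rm is left as a separate, deliberate step.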

error while executing the following commands

When I run the following commands I am getting the below output:
sudo docker run ubuntu /bin/echo hello world
WARNING: WARNING: Local (127.0.0.1) DNS resolver found in resolv.conf and containers can't use it. Using default external servers : [8.8.8.8 8.8.4.4]
And when I run docker version, the output is:
mkdir /var/lib/docker/containers: permission denied[/var/lib/docker|a0f30ece] -job initserver() = ERR (1)
2014/03/03 21:49:51 initserver: mkdir /var/lib/docker/containers: permission denied
What is the problem?
My problem was solved by the following:
Try modifying the /etc/default/docker file and uncommenting the DOCKER_OPTS line:
# Use DOCKER_OPTS to modify the daemon startup options.
#DOCKER_OPTS="-dns 8.8.8.8"
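To preview the change without touching the file, something like the sketch below prints the file with only the DOCKER_OPTS line uncommented. The function name is made up for illustration, and the flag value is left exactly as it appears in the file.

```shell
# Print the contents of a defaults file ($1) with the DOCKER_OPTS line uncommented.
# The file itself is not modified.
uncomment_docker_opts() {
    sed 's/^#\(DOCKER_OPTS=\)/\1/' "$1"
}

# Example (not run here): uncomment_docker_opts /etc/default/docker
```

Once the preview looks right, the same edit can be made in the real file with an editor, followed by a restart of the Docker daemon.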
