Okay, this is very easy to reproduce and incredibly frustrating. I would be super grateful for any help or advice! I'm using Docker for Mac on OS X El Capitan (10.11.6). The gist is that docker-machine fails to provision a VM on Google Compute Engine (GCE) via the Docker GCE driver (Docker official docs reference here). Steps to reproduce:
1: Sign up for a new free GCP (Google Cloud) account at http://console.cloud.google.com/. Also download and install the Google Cloud SDK from here: https://cloud.google.com/sdk/.
2: Create a new Google Cloud project.
3: Go to "API Manager" in the Google Cloud console and click on "Credentials"
4: Click on "Create Credentials" and select "Service Account Key". Select "Compute Engine default service account", make sure JSON is selected as the key type, and click "Create". Move the downloaded JSON file to your home directory (/Users/MYUSERNAME).
5: Add the following line to your .bash_profile config, then save the file:
export GOOGLE_APPLICATION_CREDENTIALS=/Users/MYUSERNAME/NAME_OF_CREDENTIALS_FILE.json
6: Exit the terminal and open up a new one so that the env variable is now set.
7: Run gcloud config set project PROJECT_ID (where PROJECT_ID is the ID of the project just created in the Google Cloud Console).
8: Run gcloud auth login which will open a browser tab to log you into Google and grant permissions. Click 'Allow'.
9: Now the fun part, run the following command, per the Docker documentation (I've added a --debug flag):
docker-machine --debug create --driver google --google-project PROJECT_ID vm01
('vm01' is the name of the virtual machine here, this could be anything you want.)
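Steps 5 and 6 are easy to get subtly wrong (a typo in the path, or a terminal that was not restarted). A minimal pre-flight sketch, assuming a POSIX shell; check_gce_credentials is a hypothetical helper name and is not part of gcloud or docker-machine:

```shell
# Hypothetical helper: verifies that GOOGLE_APPLICATION_CREDENTIALS is set
# and points at a readable file before docker-machine is invoked.
check_gce_credentials() {
    if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
        return 1
    fi
    if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
        echo "credentials file is not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
        return 1
    fi
    echo "credentials file found: $GOOGLE_APPLICATION_CREDENTIALS"
}

# Usage (not executed here):
#   check_gce_credentials && docker-machine --debug create --driver google \
#       --google-project PROJECT_ID vm01
```

This only confirms that the file exists and is readable; it does not validate the key itself.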
At the end of the very lengthy output I get the following, concluded by the error message at the very bottom:
(LOTS OF OTHER OUTPUT BEFORE THIS, NOT ABLE TO COPY-PASTE EVERYTHING DUE TO STACKOVERFLOW 30000 CHAR LIMIT)
(vm01) Calling .GetURL
(vm01) Calling .DriverName
Setting Docker configuration on the remote daemon...
(vm01) Calling .GetSSHHostname
(vm01) Calling .GetSSHPort
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/nathan/.docker/machine/machines/vm01/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker-user@104.198.166.134 -o IdentitiesOnly=yes -i /Users/nathan/.docker/machine/machines/vm01/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
printf %s "[Service]
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=google
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
Environment=
[Install]
WantedBy=multi-user.target
" | sudo tee /etc/systemd/system/docker.service
SSH cmd err, output: <nil>: [Service]
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=google
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
Environment=
[Install]
WantedBy=multi-user.target
(vm01) Calling .GetSSHHostname
(vm01) Calling .GetSSHPort
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/nathan/.docker/machine/machines/vm01/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker-user@104.198.166.134 -o IdentitiesOnly=yes -i /Users/nathan/.docker/machine/machines/vm01/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
sudo systemctl daemon-reload
SSH cmd err, output: <nil>:
(vm01) Calling .GetSSHHostname
(vm01) Calling .GetSSHPort
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/nathan/.docker/machine/machines/vm01/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker-user@104.198.166.134 -o IdentitiesOnly=yes -i /Users/nathan/.docker/machine/machines/vm01/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
sudo systemctl -f start docker
SSH cmd err, output: <nil>:
(vm01) Calling .GetSSHHostname
(vm01) Calling .GetSSHPort
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHKeyPath
(vm01) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/nathan/.docker/machine/machines/vm01/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker-user@104.198.166.134 -o IdentitiesOnly=yes -i /Users/nathan/.docker/machine/machines/vm01/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
netstat -tln
SSH cmd err, output: <nil>: Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 10.0.3.1:53 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded
notifying bugsnag: [Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded]
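The netstat output above lists listeners only on ports 53 and 22. Nothing is bound to 2376, the port the generated unit file passes to the daemon with -H tcp://0.0.0.0:2376, which is consistent with the "Unable to verify the Docker daemon is listening" error. A minimal sketch of that check, assuming `netstat -tln` output in the Linux format shown in the log; port_is_listening is a hypothetical helper name:

```shell
# Hypothetical helper: reads `netstat -tln` output on stdin and succeeds
# only if some TCP socket is in LISTEN state on the given port.
port_is_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # The local address is the 4th column; the port is whatever
            # follows the last ":" (this also handles tcp6 entries like :::22).
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage (not executed here), run against the VM:
#   docker-machine ssh vm01 netstat -tln | port_is_listening 2376 \
#       || echo "Docker daemon is not listening on 2376"
```

If the check fails, the daemon's own logs on the VM are the next place to look; the sketch only tells you that the port is closed, not why.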
Solved this just now. I used an updated image from the Google registry (Ubuntu 16.04 LTS, versus the default Ubuntu 15 that gets used by the docker-machine --driver google command) and it seems to have worked properly. Not sure why. The full command was:
docker-machine --debug create --driver google --google-project PROJECT_ID --google-machine-image https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1604-xenial-v20161205 vm02
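The value passed to --google-machine-image in the command above is a full image URL made up of an image project (ubuntu-os-cloud) and an image name (ubuntu-1604-xenial-v20161205). As a small sketch, assuming other images follow the same URL layout as this one, a hypothetical helper gce_image_url can assemble it:

```shell
# Hypothetical helper: builds an image URL in the same layout as the one
# used in the working command. $1 = image project, $2 = image name.
gce_image_url() {
    printf 'https://www.googleapis.com/compute/v1/projects/%s/global/images/%s\n' "$1" "$2"
}

# Usage (not executed here):
#   docker-machine --debug create --driver google --google-project PROJECT_ID \
#       --google-machine-image "$(gce_image_url ubuntu-os-cloud ubuntu-1604-xenial-v20161205)" vm02
```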
Sharing a solution to this issue in case it's helpful to somebody, as the exact issue / fix doesn't seem to be covered by other threads with similar titles.
The symptom was that on attempting to create a new vm with docker-machine create --driver hyperv testvm, the process hung at:
Running pre-create checks...
Creating machine...
(testvm) Copying F:\Virtual\Docker\cache\boot2docker.iso to
F:\Virtual\Docker\machines\testvm\boot2docker.iso...
(testvm) Creating SSH key...
(testvm) Creating VM...
(testvm) Using switch "Docker External Switch"
(testvm) Creating VHD
(testvm) Starting VM...
(testvm) Waiting for host to start...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
This was with Windows 10 Pro, Hyper-V, and a fresh install of Docker Desktop Community 2.0.0.3 (although I suspect that Hyper-V is irrelevant to this issue).
When I ctrl-c'd out of the create command I could docker-machine ls and see that the VM was up, but was showing an error:
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
testvm - hyperv Running tcp://192.168.5.61:2376 Unknown Unable to query docker version: Get https://192.168.5.60:2376/v1.15/version: x509: certificate signed by unknown authority
All attempts to docker-machine ssh to it failed similarly:
PS C:\> docker-machine ssh testvm
exit status 255
I tried using Git Bash as suggested in various threads elsewhere, but was seeing, e.g.:
$ docker-machine ssh testvm
Error: Cannot run SSH command: Host "testvm" is not running
(Likely some kind of configuration issue with my git bash install, but was unable to figure out what it was!)
The problem turned out to be some kind of compatibility issue with my installation of OpenSSH here:
PS C:\> get-command ssh
CommandType Name Version Source
----------- ---- ------- ------
Application ssh.exe 7.7.2.1 C:\Windows\System32\OpenSSH\ssh.exe
This was producing debug output (when docker-machine was run with the -debug switch) along these lines:
(testvm) Calling .GetSSHPort
(testvm) Calling .GetSSHKeyPath
(testvm) Calling .GetSSHKeyPath
(testvm) Calling .GetSSHUsername
Using SSH client type: external
&{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null docker@192.168.5.61 -o IdentitiesOnly=yes -i F:\Virtual\Docker\machines\testvm\id_rsa -p 22] C:\Windows\System32\OpenSSH\ssh.exe <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 255:
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err : exit status 255
output :
Everything started to work when I used the --native-ssh switch which is documented here. I was then able to:
docker-machine --native-ssh regenerate-certs testvm
..to resolve the certificate issue, and:
PS C:\> docker-machine --native-ssh ssh testvm ps
PID TTY TIME CMD
3301 pts/0 00:00:00 ps
..etc.
Probably better though to:
docker-machine rm -y testvm
docker-machine --native-ssh create --driver hyperv testvm
Everything was working for me without the switch at one point - my guess is that I didn't have OpenSSH installed at that time, and docker-machine was using its native version by default.
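The tell-tale pattern in the debug output above is the pair "Using SSH client type: external" followed by "SSH cmd err, output: exit status 255". A minimal sketch for spotting it in a captured log, assuming a POSIX shell such as Git Bash; external_ssh_failed is a hypothetical helper name and the two strings are taken verbatim from the log:

```shell
# Hypothetical helper: reads docker-machine debug output on stdin and succeeds
# if the external ssh client was used AND a command failed with status 255.
external_ssh_failed() {
    log=$(cat)
    printf '%s\n' "$log" | grep 'Using SSH client type: external' >/dev/null &&
        printf '%s\n' "$log" | grep 'SSH cmd err, output: exit status 255' >/dev/null
}

# Usage (not executed here):
#   docker-machine --debug ssh testvm exit 2>&1 | external_ssh_failed \
#       && echo "external ssh client is failing; try --native-ssh"
```

A match does not prove that the installed OpenSSH build is at fault, only that retrying with --native-ssh is worth a try.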
I wanted to run a docker container at work following these instructions:
https://docs.docker.com/machine/drivers/hyper-v/#environment-variables-and-default-values
When I run docker-machine -D create -d hyperv --hyperv-virtual-switch "minikube" --hyperv-cpu-count "1" --hyperv-memory "1024" --hyperv-disk-size "20000" worker4
docker-machine gets stuck and repeats this step over and over:
Waiting for SSH to be available...
Getting to WaitForSSH function...
(worker4) Calling .GetSSHHostname
(worker4) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Get-VM worker4 ).state
(worker4) DBG | [stdout =====>] : Running
(worker4) DBG |
(worker4) DBG | [stderr =====>] :
(worker4) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Get-VM worker4 ).networkadapters[0]).ipaddresses[0]
(worker4) DBG | [stdout =====>] : fe80::215:5dff:fe0a:2b3d
(worker4) DBG |
(worker4) DBG | [stderr =====>] :
(worker4) Calling .GetSSHPort
(worker4) Calling .GetSSHKeyPath
(worker4) Calling .GetSSHKeyPath
(worker4) Calling .GetSSHUsername
Using SSH client type: external
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@fe80::215:5dff:fe0a:2b3d -o IdentitiesOnly=yes -i C:\Users\account\.docker\machine\machines\worker4\id_rsa -p 22] C:\Program Files\Git\usr\bin\ssh.exe <nil>}
I tried the same steps at home (both systems run Windows 10) and they succeeded. After comparing the logs I found that I got a local IPv4 address at home. We use only IPv4 at work, so I am confused why I got an IPv6 address. Could this be why it got stuck?
Update:
After I removed Git Bash and ran the command again, I got this error:
Error dialing TCP: dial tcp [fe80::215:5dff:fe0a:2b47]:22: connectex: A socket operation was attempted to an unreachable network.
According to Microsoft, this means the network is unreachable. Could my network card be misconfigured, or could the router have a problem?
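The addresses in the logs (fe80::215:5dff:fe0a:2b3d and fe80::215:5dff:fe0a:2b47) fall in the IPv6 link-local range fe80::/10. Such addresses are valid only on the local link, so seeing one here suggests the VM had not obtained an IPv4 address when docker-machine asked Hyper-V for its first IP. A minimal sketch for recognising that case, assuming a POSIX shell; is_link_local_ipv6 is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the argument looks like an IPv6 link-local
# address, i.e. its first group lies between fe80 and febf (fe80::/10).
is_link_local_ipv6() {
    case "$1" in
        [fF][eE][89aAbB][0-9a-fA-F]:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (not executed here):
#   is_link_local_ipv6 "$(docker-machine ip worker4)" \
#       && echo "VM has only a link-local address; check DHCP on the virtual switch"
```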
The problem was that the IP addresses on our network are statically assigned. The workaround:
Create the minikube VM on a network with dynamic addressing (DHCP).
minikube stop
Connect your PC to the static network and give the minikube VM a MAC address that has an IP address assigned to it.
minikube start
I am getting the error below when issuing minikube start (minikube start --vm-driver=virtualbox --v=7) command:
Waiting for SSH to be available...
Getting to WaitForSSH function...
Using SSH client type: external
Using SSH private key: /root/.minikube/machines/minikube/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@127.0.0.1 -o IdentitiesOnly=yes -i /root/.minikube/machines/minikube/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 255:
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err : exit status 255
output :
While researching the above log lines I noticed that the ssh command isn't targeting the minikube virtual machine's IP but 127.0.0.1. If I manually run the ssh command against 127.0.0.1, I get a permission denied error.
/usr/bin/ssh -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@127.0.0.1 -o IdentitiesOnly=yes -i /root/.minikube/machines/minikube/id_rsa -p 22
Warning: Permanently added '127.0.0.1' (ECDSA) to the list of known hosts.
Permission denied (publickey,password).
Shouldn't the script connect to the minikube IP rather than 127.0.0.1? Here is the output from VBoxManage showvminfo:
/usr/bin/VBoxManage showvminfo minikube | grep NIC
NIC 1: MAC: 08002790443F, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings: MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0): name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 37549, guest ip = , guest port = 22
NIC 2: MAC: 08002790D54C, Attachment: Host-only Interface 'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
My system layout is as follows:
Vmwareplayer 6.0.5 build-2443746, hypervisor config enabled.
Ubuntu 17.04
virtualbox 5.1.22
minikube version: v0.21.0
kubectl version 1.7.0
thanks in advance
@eslimasec, minikube ssh always uses a port forward to reach the VM:
NIC 1 Rule(0): name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 37549, guest ip = , guest port = 22
A connection to 127.0.0.1:37549 is forwarded to port 22 on the VM.
So when you test SSH against the minikube VM, use port 37549 instead of 22:
/usr/bin/ssh -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@127.0.0.1 -o IdentitiesOnly=yes -i /root/.minikube/machines/minikube/id_rsa -p 37549
This is also the root cause of the failure in your minikube start.
Hope it is helpful.
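The forwarded port changes from machine to machine, so it can help to read it from the NAT rule rather than copy it by hand. A minimal sketch, assuming the rule is named "ssh" and is printed in the format shown above; forwarded_ssh_port is a hypothetical helper name:

```shell
# Hypothetical helper: reads `VBoxManage showvminfo <vm>` output on stdin and
# prints the host port of the NAT forwarding rule named "ssh".
forwarded_ssh_port() {
    sed -n 's/.*name = ssh,.*host port = \([0-9][0-9]*\),.*/\1/p'
}

# Usage (not executed here):
#   port=$(VBoxManage showvminfo minikube | forwarded_ssh_port)
#   ssh -i /root/.minikube/machines/minikube/id_rsa -p "$port" docker@127.0.0.1
```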
I am using Docker to test my playbooks.
I created a container, and when I run the command below inside it, I get the following error:
ansible-playbook jenkins.yml
Error:-
[root@db1e9105692d jenkins-playbook]# ansible-playbook jenkins.yml -k -vvv
SSH password:
PLAY [localhost] **************************************************************
GATHERING FACTS ***************************************************************
<localhost> ESTABLISH CONNECTION FOR USER: root
<localhost> REMOTE_MODULE setup
<localhost> EXEC sshpass -d4 ssh -C -tt -v -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o StrictHostKeyChecking=no -o GSSAPIAuthentication=no -o PubkeyAuthentication=no -o ConnectTimeout=10 localhost /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1454580537.38-114451000565344 && echo $HOME/.ansible/tmp/ansible-tmp-1454580537.38-114451000565344'
EXEC previous known host file not found for localhost
fatal: [localhost] => SSH Error: ssh: connect to host localhost port 22: Connection refused
while connecting to 127.0.0.1:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
TASK: [jenkins | Include OS-Specific variables] *******************************
<localhost> ESTABLISH CONNECTION FOR USER: root
fatal: [localhost] => One or more undefined variables: 'ansible_os_family' is undefined
FATAL: all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
to retry, use: --limit @/root/jenkins.retry
localhost : ok=0 changed=0 unreachable=2 failed=0
But if I run this command on the host machine, it runs fine. Do I need to do anything so that the connection on port 22 is not refused inside the Docker container?
Please do not treat the line below as the cause of the error. Ansible simply executed a few more lines before aborting; because the fact-gathering step could not run, this variable is empty.
fatal: [localhost] => One or more undefined variables: 'ansible_os_family' is undefined
In your container start your playbook locally:
$ ansible-playbook jenkins.yml -c local -k -vvv
Do you have connection=local defined for localhost? Ansible is trying to connect via SSH, which cannot work because you probably do not have sshd running in your container.
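An alternative to passing -c local on every run is to declare the local connection in an inventory file through the ansible_connection host variable. A minimal sketch, assuming a throwaway inventory path of your choosing; write_local_inventory is a hypothetical helper name:

```shell
# Hypothetical helper: writes a one-line inventory that tells Ansible to run
# tasks for localhost directly instead of opening an SSH connection.
write_local_inventory() {
    printf 'localhost ansible_connection=local\n' > "$1"
}

# Usage (not executed here):
#   write_local_inventory ./inventory.local
#   ansible-playbook -i ./inventory.local jenkins.yml -vvv
```

With no SSH connection involved, the -k password prompt should no longer be needed.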
I am running Docker on my local OS X host. I created a dev machine, and I am trying to run a command over SSH on it:
~$ docker-machine ssh dev -- ssh -o 'StrictHostKeyChecking no' \
-i /Users/yves/.docker/machine/machines/dev/id_rsa \
-N -L 5000:localhost:5000 root@harbor.dufour16.net &
I get:
[1] 28171
~$ exit status 255
Then I don't get a prompt back. I need to press CTRL-C, and I get:
[1]+ Exit 1 docker-machine ssh dev -- ssh -o 'StrictHostKeyChecking
no' -i /Users/yves/.docker/machine/machines/dev/id_rsa
-N -L 5000:localhost:5000 root@harbor.dufour16.net
Is there a way to execute it correctly? (Using boot2docker it was easier, as the command to be executed was quoted.) Thanks for any feedback.
You should be able to just use quotes, i.e.:
ssh dev "ssh -o 'StrictHostKeyChecking no' \
-i /Users/yves/.docker/machine/machines/dev/id_rsa \
-N -L 5000:localhost:5000 root@harbor.dufour16.net &"
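One reason the quoted form behaves differently: in the unquoted command, the local shell splits the line into separate words, strips the inner quotes, and interprets the trailing & itself, so it backgrounds the local command. Inside double quotes the whole string, & included, is handed over as a single argument. A small sketch that makes the word splitting visible, using a hypothetical count_args helper and a placeholder host name:

```shell
# Hypothetical helper: prints how many arguments the shell passed to it.
count_args() {
    echo "$#"
}

# Unquoted: the shell delivers five separate words.
#   count_args ssh -o 'StrictHostKeyChecking no' -N host     # prints 5
# Quoted: the whole command line arrives as one argument, "&" included.
#   count_args "ssh -o 'StrictHostKeyChecking no' -N host &" # prints 1
```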