I get this error when running an image with nvidia-docker. The image used to run fine, but now it fails. The only change I made on the machine was cloning docker/compose and then removing it again.
I can run nvidia-docker hello-world, but that does not use the Nvidia drivers.
In the output below I replaced the full path with XXX.
running command:
nvidia-docker run --name my_test arg1 bash
docker: Error response from daemon: create nvidia_driver_410.48: found reference to volume 'nvidia_driver_410.48' in driver 'nvidia-docker', but got an error while checking the driver: error while checking if volume "nvidia_driver_410.48" exists in driver "nvidia-docker": Post http://XXXdocker%2Fplugins%2Fnvidia-docker.sock/VolumeDriver.Get: dial unix XXX/docker/plugins/nvidia-docker.sock: connect: connection refused: volume name must be unique.
Try deleting the old containers (including stopped ones), then list and delete the nvidia-docker volumes. If that does not solve the problem, restart the Docker daemon.
If it still fails, purge and reinstall nvidia-container-toolkit, then restart the Docker daemon again.
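The cleanup described above could look roughly like this. It is only a sketch: the function name cleanup_nvidia_volumes is made up for illustration, and it assumes the stale volumes belong to the legacy nvidia-docker volume driver named in the error message. If the volume plugin is unreachable, docker volume rm may itself fail, so treat this as a starting point rather than a guaranteed fix.

```shell
# Hypothetical helper: remove containers that still reference
# nvidia-docker volumes, then remove the volumes themselves.
cleanup_nvidia_volumes() {
    # Names of all volumes created by the nvidia-docker volume driver.
    for vol in $(docker volume ls -q -f driver=nvidia-docker); do
        # Every container (running or stopped) that mounts this volume.
        for ctr in $(docker ps -aq -f "volume=$vol"); do
            docker rm -f "$ctr"
        done
        docker volume rm "$vol"
    done
}
```

Call cleanup_nvidia_volumes once, then retry the nvidia-docker run command. Note that docker rm -f deletes the containers for good, so check the list from docker ps -a first if any of them matter.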
What I want to do:
I have dockerd running on one machine with TLS verify set to true. I would like to add this host as a machine in docker-machine
What I have done:
I used the following command to start dockerd:
$ sudo dockerd -D --tls=true --tlscert=cert.pem --tlskey=key.pem -H tcp://172.19.48.247:2376
On a second machine I sourced the following variables:
export DOCKER_HOST=tcp://172.19.48.247:2376
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/path/to/ssl
and ran a docker command successfully:
$ docker run busybox echo hello
hello
Then I added this host to docker-machine:
docker-machine create --driver none --url=tcp://172.19.48.247:2376 dockerhost
Where I am going wrong:
I am getting a x509: certificate signed by unknown authority error now.
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS Unknown
dockerhost - none Running tcp://172.19.48.247:2376 Unknown Unable to query docker version: Get https://172.19.48.247:2376/v1.15/version: x509: certificate signed by unknown authority
I tried using docker-machine config, but that doesn't work:
$ docker-machine config dockerhost --tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem -H tcp://172.19.48.247:2376
Incorrect Usage.
Usage: docker-machine config [OPTIONS] [arg...]
Print the connection config for machine
Description:
Argument is a machine name.
Options:
--swarm Display the Swarm config instead of the Docker daemon
flag provided but not defined: -tlsverify
By default, the none driver is configured to use the TLS certs found under ~/.docker/machine/certs. That is not necessarily what you need: if your remote Docker host has a certificate signed by a CA other than the ca.pem at that location, you get exactly the error you are seeing.
I've found a reference to a workaround here that I tested and it definitely seems to work. Here are the steps I followed:
docker-machine create -d none --url tcp://remotedocker.example.com:2376 remotedocker
This creates the following directory:
~/.docker/machine/machines/remotedocker
Inside that directory is a file called config.json. Edit that file, and change every instance of ".docker/machine/certs" to ".docker/machine/machines/remotedocker"
Normally, when you access Docker remotely, it only needs to have access to the ca.pem, cert.pem and key.pem files. As far as I can tell, the other files referenced in config.json will likely not get used by the none driver because regenerate-certs is not implemented by none.
You will need to copy your ca.pem, cert.pem and key.pem files into that directory.
At this point, you should be able to run docker-machine config remotedocker, or eval "$(docker-machine env remotedocker)" and use your remote daemon successfully.
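For what it's worth, the manual edit and the copy step can be scripted. The sketch below is an assumption-laden illustration, not something docker-machine ships: use_own_certs is a made-up name, it assumes a Unix-like shell with config.json using forward-slash paths, and it assumes you have the remote host's ca.pem, cert.pem and key.pem together in one directory.

```shell
# Hypothetical helper: point a "none"-driver machine at its own cert directory.
# $1 = machine name
# $2 = directory holding the remote host's ca.pem, cert.pem and key.pem
use_own_certs() {
    machine_dir="$HOME/.docker/machine/machines/$1"
    # Copy the client certs next to config.json.
    cp "$2/ca.pem" "$2/cert.pem" "$2/key.pem" "$machine_dir/" || return 1
    # Rewrite every reference to the shared cert directory.
    # sed leaves the original file behind as config.json.bak.
    sed -i.bak "s#\.docker/machine/certs#.docker/machine/machines/$1#g" \
        "$machine_dir/config.json"
}
```

With the machine created as above, use_own_certs remotedocker /path/to/ssl would do both steps. If something goes wrong, config.json.bak holds the unmodified file.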
After (successfully, I believe) installing the Docker Toolbox, I get the following error:
$ docker ps
error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.26/containers/json: open //./pipe/docker_engine: The system cannot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to connect. This error may also indicate that the docker daemon is not running.
Also, when I try to run the docker quickstart terminal, it just prints the following error:
Docker Machine is not installed. Please re-run the Toolbox Installer and try again.
Looks like something went wrong in step 'Looking for vboxmanage.exe'... Press any key to continue...
I searched through the docker troubleshooting but didn't find any hint.
I tried installing the toolbox both with and without checking the "Install VirtualBox with NDIS5 driver [default NDIS6]" checkbox.
Try this.
Step 1: check whether the docker machine exists, using the command below.
docker-machine ls
If you see a machine listed with STATE Stopped, start it with docker-machine start machine_name, e.g. docker-machine start default. If you still get an error, go on to step 2 below.
Step 2: create a docker machine.
docker-machine create --driver virtualbox default
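The two steps can be combined into one small function. This is just a sketch: ensure_machine is a name I made up, and it assumes that docker-machine status exits with a non-zero code when the machine does not exist.

```shell
# Hypothetical helper: start the machine if it exists, create it otherwise.
# $1 = machine name (defaults to "default")
ensure_machine() {
    name=${1:-default}
    if ! state=$(docker-machine status "$name" 2>/dev/null); then
        # status failed, most likely because there is no such machine: create it.
        docker-machine create --driver virtualbox "$name"
    elif [ "$state" != "Running" ]; then
        # The machine exists but is stopped (or in some other state): start it.
        docker-machine start "$name"
    fi
}
```

Running ensure_machine default does nothing when the machine is already running.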
I faced a similar problem and fixed it with the following steps on Windows 8.1.
Install Docker Toolbox
REGENERATE CERTIFICATES
$ docker-machine regenerate-certs default
Regenerate TLS machine certs? Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Create new machine default
$ docker-machine create default
Note: It may take a few minutes for downloading boot2docker.iso file.
UPGRADE docker-machine (Optional)
$ docker-machine upgrade
START Docker Quickstart Terminal, or run the start.sh file located in C:\Program Files\Docker Toolbox
RUN HELLOWORLD
$ docker run hello-world
I hope it will help you :-)
When I try to do docker run I get this:
docker: Cannot connect to the Docker daemon. Is the docker daemon running on this host?.
So I looked here https://github.com/docker/kitematic/issues/1010 and I tried this:
docker-machine env default
But I'm getting:
Error checking TLS connection: exit status 126
So I looked here https://github.com/docker/toolbox/issues/453 and I tried this:
docker-machine rm default
Now I'm getting:
Error removing host "default": exit status 126
So what is the issue and how can I solve it?
This issue could be caused by a few things:
Permissions - your user does not have the correct access rights to talk to the socket. Run sudo usermod -aG docker YOUR-USER, replacing YOUR-USER with your user name. Note that you need to log out completely and log back in for the change to take effect.
Your shell env is not set up for docker-machine - each terminal tab you open needs to be pointed at the correct machine. Try running eval $(docker-machine env default) and then some docker commands to see if that resolves your issue.
Try regenerating the TLS certs for the machine and repeating step 2 - I noticed there was a TLS error. Sometimes the certs for connecting to the daemon become invalid. Regenerate them by running docker-machine regenerate-certs default.
Update me with your progress and I'll be happy to help troubleshoot further.
Hope this helps
Dylan
Edit
Try creating a fresh docker machine with docker-machine create -d YOUR-PROVIDER YOUR-NAME to see whether the issue is specific to that machine.
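Steps 2 and 3 above can be chained so the certs are only regenerated when setting up the env fails. A rough sketch, with connect_machine being a made-up name; note that regenerate-certs restarts the Docker daemon on the machine, so running containers may stop.

```shell
# Hypothetical helper: load the docker-machine env into the current shell,
# regenerating the TLS certs once if the first attempt fails.
# $1 = machine name (defaults to "default")
connect_machine() {
    name=${1:-default}
    if ! env_out=$(docker-machine env "$name"); then
        # env failed (e.g. TLS error): regenerate the certs and try once more.
        docker-machine regenerate-certs -f "$name" || return 1
        env_out=$(docker-machine env "$name") || return 1
    fi
    # Apply the export lines printed by docker-machine env.
    eval "$env_out"
}
```

After connect_machine default, commands like docker ps in the same terminal tab should talk to that machine.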
After I restarted Windows, I cannot connect to the docker machine running in Oracle VirtualBox.
When I start Docker Quickstart Terminal everything looks fine: it comes up OK and gives me this message:
docker is configured to use the default machine with IP 192.168.99.100
For help getting started, check out the docs at https://docs.docker.com
but when I do:
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Timeout
and:
λ docker images
An error occurred trying to connect: Get http://localhost:2375/v1.21/images/json: dial tcp 127.0.0.1:2375: ConnectEx tcp: No connection could be made because the target machine actively refused it.
Also, when I try to reinitialize my env, I get:
λ docker-machine env default
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": dial tcp 192.168.99.100:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.
BTW, regenerating the certs does not help either.
Any idea?
Thanks.
Please try regenerating certificates manually by:
docker-machine --debug regenerate-certs -f default
and check for any errors to fix, then try again:
docker-machine --debug env default
If it fails on ssh, copy the ssh command from the debug output, add -vv to it, and run it in a terminal to see what the problem is.
If you've got:
debug1: connect to address 127.0.0.1 port 64368: Connection refused
then your machine isn't running (check with docker-machine ls), so try:
docker-machine start
Then try to ssh to it via:
docker-machine -D ssh default
After doing some research I found that the following workaround may solve the issue for now:
Open Network and Sharing Center
Click on Change adapter settings
See if you have any other enabled adapters, such as VPN or VMware network adapters.
Try disabling them and connecting to your container again.
If it still doesn't work with the other adapters disabled, restart your PC. That is what worked in my case.
What worked for me is this answer from the docker-machine repo:
docker-machine regenerate-certs --client-certs [name]
Basically, what expired was the client certificates. The error message I got from docker-machine was similar to yours (i.e., it gave no indication that it is the client certs that need to be regenerated).
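Since the error message does not say which certificate is the problem, you can check expiry yourself with openssl. A small sketch: cert_valid_for is a made-up helper name, and the paths in the comment are where I would expect docker-machine to keep the client and server certs, so adjust them if your layout differs.

```shell
# Hypothetical helper: succeeds if the certificate in $1 is still valid
# $2 seconds from now (default 0, i.e. valid right now).
cert_valid_for() {
    openssl x509 -checkend "${2:-0}" -noout -in "$1" >/dev/null
}

# Example (paths are assumptions):
#   cert_valid_for ~/.docker/machine/certs/cert.pem || echo "client cert expired"
#   cert_valid_for ~/.docker/machine/machines/default/server.pem || echo "server cert expired"
```

If only the client cert has expired, the --client-certs variant above is the one to run.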
I fixed it by doing this:
Removed all host-only interfaces from my VirtualBox (VirtualBox → Preferences → Network → Host-only networks).
rmdir.exe --ignore-fail-on-non-empty ~/.docker/
docker-machine start
docker-machine env
eval $("C:\Program Files\Docker Toolbox\docker-machine.exe" env default) (added also at the end of my .bash_profile).
docker run hello-world ← now working
Inspired by this post.
Here is what worked for me. The first steps are similar to what Hazhir proposed, followed by regenerating the certificates.
Open Network And Sharing Center.
Click on Change Adapter Setting.
Disable all active VMware network adapters. They usually have the description "VirtualBox Host-Only Ethernet Adapter".
Connect to your container by running docker-machine start.
Run docker-machine env. If you're like me, you'll get the following error:
Error checking TLS connection: Error checking and/or regenerating the
certs: There was an error validating certificates for host
"192.168.99.100:2376": x509: certificate is valid for 192.168.99.101,
not 192.168.99.100
Which is good. Now all we need to do is to run
docker-machine regenerate-certs -f default
Then test it again with docker-machine env. If you get:
SET DOCKER_TLS_VERIFY=1
SET DOCKER_HOST=tcp://192.168.99.100:2376
SET DOCKER_CERT_PATH=C:\Users\Jay\.docker\machine\machines\default
SET DOCKER_MACHINE_NAME=default
REM Run this command to configure your shell:
REM FOR /f "tokens=*" %i IN ('docker-machine env') DO %i
Then you're all set. In my case I needed to start my virtual machine by running Docker Quickstart Terminal.
I had this problem too. Running docker-machine regenerate-certs <vm-name> did not solve it. I searched Google for the error message and found the solution below.
Run sudo ifconfig vboxnet0 up in a terminal.
Check the docker machine state with docker-machine ls.
STATE and URL are now OK.
However, after a system restart the problem comes back.
The GitHub issue I found is linked here.
It seems there is a bug in VirtualBox 5.1.24.
Just start the docker machine and then regenerate certificates
docker-machine start <machine-name>
docker-machine regenerate-certs <machine-name>
It works like a charm for me.
None of the answers here helped me. My problem occurred when I tried to set up my shell for the virtual machine with eval $(docker-machine env default).
It was trying to reach port 2376, which was closed, so I had to log in to the VM through ssh and add the following UFW rule:
sudo ufw allow 2376
I make sure I can always connect to my docker machines by assigning them a fixed IP and regenerating the certs only once (no reboot needed).
After that, docker-machine ls always works.
My current script:
(replace %PRGS%\dm\latest by the path where docker-machine.exe is on your machine)
(make sure PATH includes the latest /path/to/git/usr/bin, so that commands like ssh are available)
> more dmvbf.bat
@echo off
setlocal enabledelayedexpansion
set machine=%1
if "%machine%" == "" (
echo dmvbf expects a machine name
exit /b 1
)
set ipx=%2
if "%ipx%" == "" (
echo dmvbf x missing ^(for 192.168.x.y^)
exit /b 2
)
set ipy=%3
if "%ipy%" == "" (
echo dmvbf y missing ^(for 192.168.x.y^)
exit /b 3
)
%PRGS%\dm\latest\docker-machine.exe ssh %machine% "sudo sh -c 'echo \"kill \$(more /var/run/udhcpc.eth1.pid)\" | sudo tee /var/lib/boot2docker/bootsync.sh >/dev/null'"
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo sh -c 'echo \"ifconfig eth1 192.168.%ipx%.%ipy% netmask 255.255.255.0 broadcast 192.168.%ipx%.255 up\" | sudo tee -a /var/lib/boot2docker/bootsync.sh >/dev/null'"
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo chmod 755 /var/lib/boot2docker/bootsync.sh"
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo cat /var/run/udhcpc.eth1.pid | xargs sudo kill"
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo ifconfig eth1 192.168.%ipx%.%ipy% netmask 255.255.255.0 broadcast 192.168.%ipx%.255 up"
For instance:
dmvbf default 99 100
docker-machine regenerate-certs -f default
That will assign 192.168.99.100 to the docker machine 'default', and regenerate the certs once.
Then each time docker-machine ls is called, it will display the same IP for 'default'.
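To make it clearer what the batch script leaves behind on the VM: its first two ssh commands write a two-line /var/lib/boot2docker/bootsync.sh. The sketch below only prints that content for a given x and y so you can inspect it; bootsync_content is a made-up name and nothing is written to the machine.

```shell
# Hypothetical helper: print the bootsync.sh content that the batch script
# above writes for the address 192.168.x.y.
# $1 = x, $2 = y
bootsync_content() {
    # Line 1: stop the DHCP client on eth1 so it cannot reassign the address.
    printf '%s\n' 'kill $(more /var/run/udhcpc.eth1.pid)'
    # Line 2: bring eth1 up with the fixed address.
    printf 'ifconfig eth1 192.168.%s.%s netmask 255.255.255.0 broadcast 192.168.%s.255 up\n' \
        "$1" "$2" "$1"
}
```

For example, bootsync_content 99 100 prints the kill line followed by ifconfig eth1 192.168.99.100 netmask 255.255.255.0 broadcast 192.168.99.255 up.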
Try this way/workaround:
First make sure that ca.pem, cert.pem, key.pem and ca-key.pem exist in the $yourhome/.docker/machine/certs/ folder. If any of these four *.pem files are missing, you can copy them from elsewhere or create them yourself (at this point these four pem files will certainly not be correct yet).
Make sure the env is set correctly in bash_profile, like:
export DOCKER_HOST=tcp://192.168.99.100:2376
export DOCKER_MACHINE_NAME=default
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/Users/johnwang/.docker/machine/machines/default
Rerun the command docker-machine regenerate-certs default (you may need to reopen the docker terminal before running it).
I tried this with Docker Toolbox on Mac, and it works.
Finally, some logs of the result:
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": x509: certificate signed by unknown authority
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.
...
...
johns-MacBook-Pro:certs johnwang$ docker-machine regenerate-certs default
Regenerate TLS machine certs? Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Waiting for SSH to be available...
Detecting the provisioner...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
johns-MacBook-Pro:certs johnwang$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Running tcp://192.168.99.100:2376 v17.03.1-ce
Hope it helps
Also see my response here: https://github.com/docker/machine/issues/2808
In my case it was FortiClient that caused the issue. After disabling it, docker-machine env default worked fine again. I suggest checking whether any anti-virus program is running on your system.
For me, running
docker-machine --debug regenerate-certs -f name_of_your_vm
worked just fine.
docker-machine version 0.16.1
virtualBox 6.0
also docker was configured to use the default machine with IP 192.168.99.100
I had the same error. I fixed it by opening TCP port 2376 in the network firewall.
The solution for my problem is taken from here:
https://github.com/docker/machine/issues/3845#issuecomment-271935924
Quote:
If you install docker-machine first time then you do not have in that
host a self-signed CA that will be used to generate your client
certificate and as many server certificates as machines you generate
later on. That CA is generated when you try to create a machine if
that CA is not yet created. So if you try to generate several servers
in parallel (by means of an script), then you’ll generate as many
self-signed (root) CA as docker create commands, all of them being
written in the same location that seems to be messing up the
environment e.g. spreading out different ca.pem to the remote machines
that do match the final version, causing the cert.pem (host identity)
to be signed by a former ca.pem which no longer exist… or whatever
other abnormal situation.
To fix it, first of all you'll need to delete your existing
self-signed CA. This can be done by removing the folder
~/.docker/machine/certs (NOTE: Note this will force the creation of a
new self-signed CA for docker-machine to use and will yield your
existing machines to fail connecting to the daemon). This will make
your docker-machine to generate valid certificates again. Then, for my
use case I am creating the first machine in foreground and all the
rest of them are done in parallel. That will cause the creation of one
root self-signed CA in isolation and then will be used for further
docker-machine create commands. It worked like a charm!
The reason why I was able to ssh to the host is because there are a
different pair of keys for sshing generate per host that was not
bitten by this.
To sum up, this is what I ended up doing:
Find out which docker-machine command is being run. I was using it with gitlab-runner, so I had to run gitlab-runner in debug mode to see which docker-machine command it was running.
then stop gitlab-runner: gitlab-runner stop
then delete the certificate: rm -rf ~/.docker/machine/certs
then run a single command (from step #1) to re-create the certs (remember: the reason this didn't work before is that the certs were being created multiple times in parallel)
then rerun gitlab-runner: gitlab-runner start
Worked for me!
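If you create the machines from your own script rather than through gitlab-runner, the "first in foreground, the rest in parallel" idea from the quote can be sketched like this. The function name create_machines and its arguments are made up for illustration, and real create commands usually need more driver-specific options than shown here.

```shell
# Hypothetical helper: create the first machine alone so the self-signed CA
# is generated exactly once, then create the remaining machines in parallel.
# $1 = driver name, $2... = machine names
create_machines() {
    driver=$1; shift
    first=$1; shift
    # Foreground: this run creates the CA under ~/.docker/machine/certs if missing.
    docker-machine create --driver "$driver" "$first" || return 1
    # Background: the CA already exists, so these runs only reuse it.
    for name in "$@"; do
        docker-machine create --driver "$driver" "$name" &
    done
    # Block until every background create has finished.
    wait
}
```

For example, create_machines virtualbox m1 m2 m3 creates m1 first and then m2 and m3 concurrently.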
For readers using brew in 2021: this can happen after the virtualbox cask has been upgraded. What fixed it for me:
System Preferences... > Security & Privacy > (Unlock with finger) Allow.
<<Your Computer Should Restart>>.
docker-machine restart default. Done
I solved this issue on macOS by installing Docker Desktop:
brew uninstall docker
brew uninstall docker-machine
Then download Docker Desktop for Mac from https://docs.docker.com/desktop/mac/install/