Running docker image hangs and then times out in the prod server - docker

I have a docker image that I can run properly in my local VM. Everything runs fine.
I save the image and load it in the prod server.
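To be clear, the transfer is just the usual save/load round trip (myservice.tar is only a placeholder filename here):
docker save -o myservice.tar myservice
docker load -i myservice.tar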
I can see the image by using docker images
Next I try to run it with docker run -p 9191:9191 myservice.
It hangs and eventually times out.
The log shows the following:
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="HCSShim::CreateContainer succeeded id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152 handle=
39044064"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="libcontainerd: Create() id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152, Calling start()"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="libcontainerd: starting container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="HCSShim::Container::Start id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.393050900-07:00" level=debug msg="Result: {\"Error\":-2147023436,\"ErrorMessage\":\"This operation returned because the timeout period expired.\
"}"
time="2018-08-15T16:18:25.394050800-07:00" level=error msg="libcontainerd: failed to start container: container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db7
9e4152 encountered an error during Start: failure in a Windows system call: This operation returned because the timeout period expired. (0x5b4)"
time="2018-08-15T16:18:25.394050800-07:00" level=debug msg="HCSShim::Container::Terminate id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.394050800-07:00" level=debug msg="libcontainerd: cleaned up after failed Start by calling Terminate"
time="2018-08-15T16:18:25.394050800-07:00" level=error msg="Create container failed with error: container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152
encountered an error during Start: failure in a Windows system call: This operation returned because the timeout period expired. (0x5b4)"
time="2018-08-15T16:18:25.424053800-07:00" level=debug msg="attach: stdout: end"
time="2018-08-15T16:18:25.425055000-07:00" level=debug msg="attach: stderr: end"
time="2018-08-15T16:18:25.427054100-07:00" level=debug msg="Revoking external connectivity on endpoint boring_babbage (b20f403df0ed25ede9152f77eb0f8e049677f1279b68862a25b
b9e2ab94babfb)"
time="2018-08-15T16:18:25.459087300-07:00" level=debug msg="[DELETE]=>[/endpoints/31e66619-5b57-47f2-9256-bbba54510e3b] Request : "
time="2018-08-15T16:18:25.548068700-07:00" level=debug msg="Releasing addresses for endpoint boring_babbage's interface on network nat"
time="2018-08-15T16:18:25.548068700-07:00" level=debug msg="ReleaseAddress(172.25.224.0/20, 172.25.229.142)"
time="2018-08-15T16:18:25.561064000-07:00" level=debug msg="WindowsGraphDriver Put() id b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.561064000-07:00" level=debug msg="hcsshim::UnprepareLayer flavour 1 layerId b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.566074800-07:00" level=debug msg="hcsshim::UnprepareLayer succeeded flavour 1 layerId=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db7
9e4152"
time="2018-08-15T16:18:25.566074800-07:00" level=debug msg="hcsshim::DeactivateLayer Flavour 1 ID b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.668075600-07:00" level=debug msg="hcsshim::DeactivateLayer succeeded flavour=1 id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e41
52"
I can see it trying to create the container, and then it fails.
But why?
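One thing I am not sure about (just a guess based on the HCS Start timeout; nothing in the log confirms it) is whether the image's Windows build matches the prod host's build. Roughly, the comparison would be:
docker image inspect myservice --format "{{.OsVersion}}"   # Windows build the image targets
cmd /c ver                                                  # Windows build of the host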
Added more information
I finally found out how to check the server status for running containers, and I am getting this error message:
So does that mean the server doesn't have a network gateway?
How can I fix this problem?
Still looking.
More information
I deleted all the NAT networks and created a new one, so the online check passes now.
However, I still encounter other errors and can't run the image.
Something in the virtual network is wrong; I just can't find the right information to fix it... :(
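For reference, a rough sketch of the delete-and-recreate step on a Windows host (elevated PowerShell; this assumes the NetNat cmdlets are available and relies on the Docker service recreating the default nat network at startup):
Stop-Service docker
Get-NetNat | Remove-NetNat -Confirm:$false
Start-Service docker
docker network ls    # "nat" should reappear here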

Related

Docker Swarm not scheduling containers when node dies

EDIT
NEVER MIND THIS QUESTION. I found that one of my services, which uses Docker.DotNet, was terminating the services marked as Shutdown. I've corrected the bug and have regained my trust in Docker and Docker Swarm.
Thank you, Carlos, for your help. My bad, my fault. Sorry for that!
I have 13 services configured on a docker-compose file and running in Swarm mode with one manager and two worker nodes.
Then I make one of the worker nodes unavailable by draining it
docker node update --availability drain ****-v3-6by7ddst
What I notice is that all the services that were running on the drained node are removed and not scheduled to the available node.
The available worker and manager nodes still have plenty of resources. The services are simply removed. I am now down to 9 services.
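To double-check that the tasks are really removed, rather than failed and restarting elsewhere, I look at the service list and the per-node task list (frontend is just one of the 13 services, used as an example):
docker service ls
docker service ps frontend --no-trunc
docker node ps ****-v3-6by7ddst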
Looking at the logs, I see entries like the ones below, repeated with different service IDs:
level=warning msg="Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap."
level=error msg="Error getting service u68b1fofzb3nefpnasctpywav: service u68b1fofzb3nefpnasctpywav not found"
level=warning msg="rmServiceBinding 021eda460c5744fd4d499475e5aa0f1cfbe5df479e5b21389ed1b501a93b47e1 possible transient state ok:false entries:0 set:false "
Then, for debugging purposes, I set my node back to available:
docker node update --availability active ****-v3-6by7ddst
Then I try to rebalance some of the services onto the newly available node. And this is the result:
I get the same errors in the logs:
level=error msg="Error getting service ****_frontend: service ****_frontend not found"
level=warning msg="rmServiceBinding 6bb220c0a95b30cdb3ff7b577c7e9dec7ad6383b34aff85e1685e94e7486e3ea possible transient state ok:false entries:0 set:false "
msg="Error getting service l29wlucttul75pzqo2sgr0u9e: service l29wlucttul75pzqo2sgr0u9e not found"
In my docker-compose file I configure all my services like this. The restart policy is any.
frontend:
  image: {FRONTEND_IMAGE}
  deploy:
    labels:
      - "traefik.enable=true"
      - "traefik.docker.lbswarm=true"
      - "traefik.http.routers.frontend.rule=Host(`${FRONTEND_HOST}`)"
      - "traefik.http.routers.frontend.entrypoints=websecure"
      - "traefik.http.routers.frontend.tls.certresolver=myhttpchallenge"
      - "traefik.http.services.frontend.loadbalancer.server.port=80"
      - "traefik.docker.network=ingress"
    replicas: 1
    resources:
      limits:
        memory: ${FRONTEND_LIMITS_MEMORY}
        cpus: ${FRONTEND_LIMITS_CPUS}
      reservations:
        memory: ${FRONTEND_RESERVATION_MEMORY}
        cpus: ${FRONTEND_RESERVATION_CPUS}
    restart_policy:
      condition: any
  networks:
    - ingress
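For completeness, the stack is deployed and checked roughly like this (mystack is a placeholder for the real stack name; the RestartPolicy path assumes the usual service-inspect JSON):
docker stack deploy -c docker-compose.yml mystack
docker service inspect mystack_frontend --format "{{json .Spec.TaskTemplate.RestartPolicy}}"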
Something fails while recreating services on different nodes, and even with only one manager/worker node I get the same result.
The rest seems to work fine. As an example, if I scale a service it works well.
New Edit
Just did another test.
This time I only have two services, traefik and front-end.
One instance for traefik
4 instances for front-end
two nodes (one manager and one worker)
Drained the worker node; the front-end instances running on it are moved to the manager node.
Activated the worker node again.
Ran docker service update cords_frontend --force, and two front-end instances are killed on the manager node and placed on the worker node.
So, with this test with only two services everything works fine.
Is there any kind of limit to the number of services a stack should have?
Any clues why this is happening?
Thanks
Hugo
I believe you may be running into an issue with resource reservations. You mention that the available nodes have plenty of resources, but that is not how reservations work: a service will not be scheduled if it can't reserve the resources specified, and this has nothing to do with how much the service is actually using. When you specify a reservation, you are saying that the service will reserve that amount of resources, and those resources are not available for other services to use. So if all your services have similar reservations, you may be in a situation where, even though the node shows available resources, those resources are in fact already reserved by the existing services. I would suggest you remove the reservations section and try again, to see if that is in fact what is happening.
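A quick way to compare what a service reserves against what a node actually offers (field paths assume the usual docker service inspect / docker node inspect JSON; frontend and self are placeholders):
docker service inspect frontend --format "{{json .Spec.TaskTemplate.Resources.Reservations}}"
docker node inspect self --format "CPUs (nano): {{.Description.Resources.NanoCPUs}}  Memory (bytes): {{.Description.Resources.MemoryBytes}}"
NanoCPUs is expressed in billionths of a core, so 4 cores shows up as 4000000000.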
So I am still struggling with this.
As a recap, I have one docker stack with 13 services running in swarm mode with two nodes (manager + worker). Each node has 4 cores and 8GB of RAM (Ubuntu 18.04, Docker 19.03.12).
If a node dies, or I drain a node, all the services running on this node die and are marked as Removed. If I simply run docker service update front_end --force, the service also dies and is marked as Removed.
Another important detail: if I sum up all the reserved memory and cores from the 13 services, I end up with 1.9 cores and 4GB of RAM, way below each node's resources.
I don't see any out-of-memory errors in the container, service, or stack logs. Also, using the htop tool I can see that memory usage is 647MB/7.79GB on the manager node and 2GB/7.79GB on the worker node.
This is what I have tried so far:
Separated the 13 services into two different stacks. No luck.
Removed all the reservations sections from the compose files. No luck.
Tried running with 3 nodes. No luck.
I was seeing the warning WARNING: No swap limit support, so I followed the documented suggestions on both nodes (see the GRUB snippet after this list). No luck.
Upped both nodes' resources to 8 cores and 16GB of RAM. No luck.
Tried starting the services one at a time, and I noticed things start behaving badly with 10 or more services. That is to say, everything works fine with up to 9 services running; after that I see the behaviour described above.
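For reference, the swap-limit change I applied (from the Docker documentation for Ubuntu hosts) boils down to:
# in /etc/default/grub:
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
# then apply and reboot:
sudo update-grub
sudo reboot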
Also, I enabled Docker's debug mode to see what was happening. Here are the outputs.
If I run docker service update front_end --force and the front_end service dies, this is the output from docker events:
service update k6a7go4uhexb4b1u1fp98dtke (name=frontend)
service update k6a7go4uhexb4b1u1fp98dtke (name=frontend, updatestate.new=updating)
service remove k6a7go4uhexb4b1u1fp98dtke (name=frontend)
logs from journalctl -fu docker.service
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="form data: {\"EndpointSpec\":{\"Mode\":\"vip\"},\"Labels\":{\"com.docker.stack.image\":\"registry.gitlab.com/devteam/.frontend:1.0.5\",\"com.docker.stack.namespace\":\"\",\"traefik.docker.lbswarm\":\"true\",\"traefik.docker.network\":\"net\",\"traefik.enable\":\"true\",\"traefik.http.routers.frontend.entrypoints\":\"websecure\",\"traefik.http.routers.frontend.rule\":\"Host(`www.frontend.website`)\",\"traefik.http.routers.frontend.tls.certresolver\":\"myhttpchallenge\",\"traefik.http.services.frontend.loadbalancer.server.port\":\"80\"},\"Mode\":{\"Replicated\":{\"Replicas\":1}},\"Name\":\"frontend\",\"TaskTemplate\":{\"ContainerSpec\":{\"Image\":\"registry.gitlab.com/fdevteam/frontend:1.0.5#sha256:e9a0d88bc14848c3b40c3d2905842313bbc648c1bbf09305f8935f9eb23f289a\",\"Isolation\":\"default\",\"Labels\":{\"com.docker.stack.namespace\":\"f\"},\"Privileges\":{\"CredentialSpec\":null,\"SELinuxContext\":null}},\"ForceUpdate\":1,\"Networks\":[{\"Aliases\":[\"frontend\"],\"Target\":\"w7aqg3stebnmk5c5pbhgslh2d\"}],\"Placement\":{\"Platforms\":[{\"Architecture\":\"amd64\",\"OS\":\"linux\"}]},\"Resources\":{},\"RestartPolicy\":{\"Condition\":\"any\",\"MaxAttempts\":0},\"Runtime\":\"container\"}}"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent UPD 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 R:{frontend.1.lv7tjjaev45pvn0f7qtppb21r frontend nnlg81dsspnj6oxip4iqwwjc3 10.0.1.73 10.0.1.74 [] [frontend] [e661c9f39097] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 p:0xc004a1f880 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{nnlg81dsspnj6oxip4iqwwjc3 } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend rm_service:false suppress:false sAliases:[frontend] tAliases:[e661c9f39097]"
level=debug msg="delContainerNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend.1.lv7tjjaev45pvn0f7qtppb21r"
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend.1.lv7tjjaev45pvn0f7qtppb21r, 10.0.1.74, <nil>, true) rmServiceBinding sid:6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.74, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=error msg="Error getting service frontend: service frontend not found"
level=debug msg="handleEpTableEvent DEL 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 R:{frontend.1.lv7tjjaev45pvn0f7qtppb21r frontend nnlg81dsspnj6oxip4iqwwjc3 10.0.1.73 10.0.1.74 [] [frontend] [e661c9f39097] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 p:0xc004a1f880 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{nnlg81dsspnj6oxip4iqwwjc3 } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend rm_service:true suppress:false sAliases:[frontend] tAliases:[e661c9f39097]"
level=debug msg="delContainerNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend.1.lv7tjjaev45pvn0f7qtppb21r"
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend.1.lv7tjjaev45pvn0f7qtppb21r, 10.0.1.74, <nil>, true) rmServiceBinding sid:6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.74, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend, 10.0.1.73, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745"
If the service does not die (which is the case with 9 or fewer services), this is the output:
service update n1wh16ru879699cpv3topcanc (name=frontend)
service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=updating)
service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=completed, updatestate.old=updating)
logs from journalctl -fu docker.service
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="form data: {\"EndpointSpec\":{\"Mode\":\"vip\"},\"Labels\":{\"com.docker.stack.image\":\"registry.gitlab.com/devteam/.frontend:1.0.5\",\"com.docker.stack.namespace\":\"\",\"traefik.docker.lbswarm\":\"true\",\"traefik.docker.network\":\"net\",\"traefik.enable\":\"true\",\"traefik.http.routers.frontend.entrypoints\":\"websecure\",\"traefik.http.routers.frontend.rule\":\"Host(`www.frontend.website`)\",\"traefik.http.routers.frontend.tls.certresolver\":\"myhttpchallenge\",\"traefik.http.services.frontend.loadbalancer.server.port\":\"80\"},\"Mode\":{\"Replicated\":{\"Replicas\":1}},\"Name\":\"frontend\",\"TaskTemplate\":{\"ContainerSpec\":{\"Image\":\"registry.gitlab.com/devteam/.frontend:1.0.5#sha256:e9a0d88bc14848c3b40c3d2905842313bbc648c1bbf09305f8935f9eb23f289a\",\"Isolation\":\"default\",\"Labels\":{\"com.docker.stack.namespace\":\"\"},\"Privileges\":{\"CredentialSpec\":null,\"SELinuxContext\":null}},\"ForceUpdate\":3,\"Networks\":[{\"Aliases\":[\"frontend\"],\"Target\":\"w7aqg3stebnmk5c5pbhgslh2d\"}],\"Placement\":{\"Platforms\":[{\"Architecture\":\"amd64\",\"OS\":\"linux\"}]},\"Resources\":{},\"RestartPolicy\":{\"Condition\":\"any\",\"MaxAttempts\":0},\"Runtime\":\"container\"}}"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent UPD e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 R:{frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.46 [] [frontend] [f986fe859440] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 p:0xc005e1fa00 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{n1wh16ru879699cpv3topcanc } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend rm_service:false suppress:false sAliases:[frontend] tAliases:[f986fe859440]"
level=debug msg="delContainerNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo"
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo, 10.0.1.46, <nil>, true) rmServiceBinding sid:e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.46, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent DEL e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 R:{frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.46 [] [frontend] [f986fe859440] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 p:0xc005e1fa00 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{n1wh16ru879699cpv3topcanc } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend rm_service:true suppress:false sAliases:[frontend] tAliases:[f986fe859440]"
level=debug msg="delContainerNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo"
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo, 10.0.1.46, <nil>, true) rmServiceBinding sid:e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.46, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend, 10.0.1.32, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent ADD 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 R:{frontend.1.1v9ggahd87x2ydlkna0qx7jmz frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.47 [] [frontend] [3671840709bb] false}"
level=debug msg="addServiceBinding from handleEpTableEvent START for frontend 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 p:0xc004a1ed80 nid:w7aqg3stebnmk5c5pbhgslh2d skey:{n1wh16ru879699cpv3topcanc }"
level=debug msg="addEndpointNameResolution 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 frontend add_service:true sAliases:[frontend] tAliases:[3671840709bb]"
level=debug msg="addContainerNameResolution 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 frontend.1.1v9ggahd87x2ydlkna0qx7jmz"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(frontend.1.1v9ggahd87x2ydlkna0qx7jmz, 10.0.1.47, <nil>, true) addServiceBinding sid:521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(tasks.frontend, 10.0.1.47, <nil>, false) addServiceBinding sid:n1wh16ru879699cpv3topcanc"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(frontend, 10.0.1.32, <nil>, false) addServiceBinding sid:n1wh16ru879699cpv3topcanc"
level=debug msg="addServiceBinding from handleEpTableEvent END for frontend 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService

Docker (daemon) on Windows 10 crashing when started: "hnsCall failed in Win32: The service has not been started", while hns service is happily running

All of a sudden I can't get Docker to work on my Windows 10 machine; the Docker daemon won't start. I can't seem to figure out what's going on: the debug output tells me that the hns service has not been started, but it is running.
This only seems to happen when I'm "switched" to running Windows containers; when I re-installed Docker and it was running Linux containers, it seemed to work fine.
Ignore that (spoiler? SO doesn't want strikethrough text I guess), I just switched back to Linux containers and now get the same error.
This is the output of running dockerd from the command line (with debug = true):
C:\Program Files\Docker\Docker\resources
λ dockerd
time="2019-10-31T11:03:23.425145500Z" level=info msg="Starting up"
time="2019-10-31T11:03:23.428147300Z" level=debug msg="Listener created for HTTP on npipe (//./pipe/docker_engine)"
time="2019-10-31T11:03:23.432148800Z" level=info msg="Windows default isolation mode: hyperv"
time="2019-10-31T11:03:23.432148800Z" level=debug msg="Stackdump - waiting signal at Global\\stackdump-19544"
time="2019-10-31T11:03:23.433145200Z" level=debug msg="Using default logging driver json-file"
time="2019-10-31T11:03:23.433145200Z" level=debug msg="[graphdriver] trying provided driver: windowsfilter"
time="2019-10-31T11:03:23.433145200Z" level=debug msg="WindowsGraphDriver InitFilter at C:\\ProgramData\\Docker\\windowsfilter"
time="2019-10-31T11:03:23.434161600Z" level=debug msg="Initialized graph driver windowsfilter"
time="2019-10-31T11:03:23.441159100Z" level=debug msg="Max Concurrent Downloads: 3"
time="2019-10-31T11:03:23.441159100Z" level=debug msg="Max Concurrent Uploads: 5"
time="2019-10-31T11:03:23.441159100Z" level=info msg="Loading containers: start."
time="2019-10-31T11:03:23.442144600Z" level=debug msg="Option Experimental: false"
time="2019-10-31T11:03:23.442144600Z" level=debug msg="Option DefaultDriver: nat"
time="2019-10-31T11:03:23.442144600Z" level=debug msg="Option DefaultNetwork: nat"
time="2019-10-31T11:03:23.442144600Z" level=debug msg="Network Control Plane MTU: 1500"
time="2019-10-31T11:03:23.442144600Z" level=info msg="Restoring existing overlay networks from HNS into docker"
time="2019-10-31T11:03:23.442144600Z" level=debug msg="[GET]=>[/networks/] Request : "
time="2019-10-31T11:03:23.447147300Z" level=debug msg="Network Response : [{\"ActivityId\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"AdditionalParams\":{},\"CurrentEndpointCount\":0,\"Extensions\":[{\"Id\":\"E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A\",\"IsEnabled\":false,\"Name\":\"Microsoft Windows Filtering Platform\"},{\"Id\":\"E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017\",\"IsEnabled\":false,\"Name\":\"Microsoft Azure VFP Switch Extension\"},{\"Id\":\"EA24CD6C-D17A-4348-9190-09F0D5BE83DD\",\"IsEnabled\":false,\"Name\":\"Microsoft NDIS Capture\"}],\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"D63871DB-DF27-4EE6-80FB-6986CA4FDD2A\",\"IPv6\":false,\"LayeredOn\":\"7B01AE19-872A-416D-BA15-AF5CECD5F9E6\",\"MacPools\":[{\"EndMacAddress\":\"00-15-5D-74-AF-FF\",\"StartMacAddress\":\"00-15-5D-74-A0-00\"}],\"MaxConcurrentEndpoints\":0,\"Name\":\"My New Virtual Switch\",\"Policies\":[],\"State\":1,\"TotalEndpoints\":0,\"Type\":\"Transparent\",\"Version\":42949672963,\"Resources\":{\"AdditionalParams\":{},\"AllocationOrder\":0,\"CompartmentOperationTime\":0,\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"PortOperationTime\":0,\"State\":1,\"SwitchOperationTime\":0,\"VfpOperationTime\":0,\"parentId\":\"18DF5BED-03C6-4825-88D8-90F4DCB5473E\"}}]"
time="2019-10-31T11:03:23.449146500Z" level=debug msg="Network transparent (1c2f3a6) restored"
time="2019-10-31T11:03:23.462148900Z" level=debug msg="[GET]=>[/networks/] Request : "
time="2019-10-31T11:03:23.466149300Z" level=debug msg="Network Response : [{\"ActivityId\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"AdditionalParams\":{},\"CurrentEndpointCount\":0,\"Extensions\":[{\"Id\":\"E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A\",\"IsEnabled\":false,\"Name\":\"Microsoft Windows Filtering Platform\"},{\"Id\":\"E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017\",\"IsEnabled\":false,\"Name\":\"Microsoft Azure VFP Switch Extension\"},{\"Id\":\"EA24CD6C-D17A-4348-9190-09F0D5BE83DD\",\"IsEnabled\":false,\"Name\":\"Microsoft NDIS Capture\"}],\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"D63871DB-DF27-4EE6-80FB-6986CA4FDD2A\",\"IPv6\":false,\"LayeredOn\":\"7B01AE19-872A-416D-BA15-AF5CECD5F9E6\",\"MacPools\":[{\"EndMacAddress\":\"00-15-5D-74-AF-FF\",\"StartMacAddress\":\"00-15-5D-74-A0-00\"}],\"MaxConcurrentEndpoints\":0,\"Name\":\"My New Virtual Switch\",\"Policies\":[],\"State\":1,\"TotalEndpoints\":0,\"Type\":\"Transparent\",\"Version\":42949672963,\"Resources\":{\"AdditionalParams\":{},\"AllocationOrder\":0,\"CompartmentOperationTime\":0,\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"PortOperationTime\":0,\"State\":1,\"SwitchOperationTime\":0,\"VfpOperationTime\":0,\"parentId\":\"18DF5BED-03C6-4825-88D8-90F4DCB5473E\"}}]"
time="2019-10-31T11:03:23.468145600Z" level=debug msg="Launching DNS server for network \"none\""
time="2019-10-31T11:03:23.477145800Z" level=debug msg="releasing IPv4 pools from network My New Virtual Switch (1c2f3a6ce8a7445896145d15e265b9eda4095d6f35c71ad872f3e733059940c6)"
time="2019-10-31T11:03:23.477145800Z" level=debug msg="ReleasePool(0.0.0.0/0)"
time="2019-10-31T11:03:23.480145100Z" level=debug msg="cleanupServiceDiscovery for network:1c2f3a6ce8a7445896145d15e265b9eda4095d6f35c71ad872f3e733059940c6"
time="2019-10-31T11:03:23.486145500Z" level=debug msg="Allocating IPv4 pools for network My New Virtual Switch (1c2f3a6ce8a7445896145d15e265b9eda4095d6f35c71ad872f3e733059940c6)"
time="2019-10-31T11:03:23.486145500Z" level=debug msg="RequestPool(LocalDefault, , , map[], false)"
time="2019-10-31T11:03:23.486145500Z" level=debug msg="RequestAddress(0.0.0.0/0, <nil>, map[RequestAddressType:com.docker.network.gateway])"
time="2019-10-31T11:03:23.487161600Z" level=debug msg="[GET]=>[/endpoints/] Request : "
time="2019-10-31T11:03:23.493144800Z" level=debug msg="Launching DNS server for network \"My New Virtual Switch\""
time="2019-10-31T11:03:23.493144800Z" level=debug msg="[GET]=>[/networks/D63871DB-DF27-4EE6-80FB-6986CA4FDD2A] Request : "
time="2019-10-31T11:03:23.497146600Z" level=debug msg="Network Response : {\"ActivityId\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"AdditionalParams\":{},\"CurrentEndpointCount\":0,\"Extensions\":[{\"Id\":\"E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A\",\"IsEnabled\":false,\"Name\":\"Microsoft Windows Filtering Platform\"},{\"Id\":\"E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017\",\"IsEnabled\":false,\"Name\":\"Microsoft Azure VFP Switch Extension\"},{\"Id\":\"EA24CD6C-D17A-4348-9190-09F0D5BE83DD\",\"IsEnabled\":false,\"Name\":\"Microsoft NDIS Capture\"}],\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"D63871DB-DF27-4EE6-80FB-6986CA4FDD2A\",\"IPv6\":false,\"LayeredOn\":\"7B01AE19-872A-416D-BA15-AF5CECD5F9E6\",\"MacPools\":[{\"EndMacAddress\":\"00-15-5D-74-AF-FF\",\"StartMacAddress\":\"00-15-5D-74-A0-00\"}],\"MaxConcurrentEndpoints\":0,\"Name\":\"My New Virtual Switch\",\"Policies\":[],\"State\":1,\"TotalEndpoints\":0,\"Type\":\"Transparent\",\"Version\":42949672963,\"Resources\":{\"AdditionalParams\":{},\"AllocationOrder\":0,\"CompartmentOperationTime\":0,\"Flags\":0,\"Health\":{\"LastErrorCode\":0,\"LastUpdateTime\":132169931612174261},\"ID\":\"9A83FF02-21EB-49F0-879B-559444E6EC70\",\"PortOperationTime\":0,\"State\":1,\"SwitchOperationTime\":0,\"VfpOperationTime\":0,\"parentId\":\"18DF5BED-03C6-4825-88D8-90F4DCB5473E\"}}"
time="2019-10-31T11:03:23.507147200Z" level=debug msg="Allocating IPv4 pools for network nat (4456741d2c4fe47a5034db26ad9b1161c24ac18105ca4a71cf23cbe4fc6e3e88)"
time="2019-10-31T11:03:23.507147200Z" level=debug msg="RequestPool(LocalDefault, , , map[], false)"
time="2019-10-31T11:03:23.507147200Z" level=debug msg="RequestAddress(0.0.0.0/0, <nil>, map[RequestAddressType:com.docker.network.gateway])"
time="2019-10-31T11:03:23.507147200Z" level=debug msg="HNSNetwork Request ={\"Name\":\"nat\",\"Type\":\"nat\",\"Subnets\":[{\"AddressPrefix\":\"0.0.0.0/0\"}]} Address Space=[{0.0.0.0/0 []}]"
time="2019-10-31T11:03:23.507147200Z" level=debug msg="[POST]=>[/networks/] Request : {\"Name\":\"nat\",\"Type\":\"nat\",\"Subnets\":[{\"AddressPrefix\":\"0.0.0.0/0\"}]}"
time="2019-10-31T11:03:24.277127400Z" level=debug msg="releasing IPv4 pools from network nat (4456741d2c4fe47a5034db26ad9b1161c24ac18105ca4a71cf23cbe4fc6e3e88)"
time="2019-10-31T11:03:24.277127400Z" level=debug msg="ReleasePool(0.0.0.0/0)"
time="2019-10-31T11:03:24.277127400Z" level=debug msg="daemon configured with a 15 seconds minimum shutdown timeout" time="2019-10-31T11:03:24.278127100Z" level=debug msg="start clean shutdown of all containers with a 15 seconds timeout..."
failed to start daemon: Error initializing network controller: Error creating default network: hnsCall failed in Win32: The service has not been started. (0x426)
But this hns service is running already:
C:\Program Files\Docker\Docker\resources
λ net start hns
The requested service has already been started.
And in Task Manager > Services, I can see hns with Status: Running.
Any idea what's going on?
I've tried
uninstalling and re-installing Docker,
removing Hyper-V and re-adding Hyper-V to Windows,
adding a new virtual switch in Hyper-V,
removing an HNS data file I saw mentioned, but there is no such file for me to remove (C:\ProgramData\Microsoft\Windows\HNS\ is empty),
starting and stopping the hns service with net stop hns and net start hns.
I can't think of anything else to try besides just reformatting and starting up a new life with a family in Mexico.
I've uploaded like a thousand crash reports through the Docker for Windows interface, because when the service fails to start, they suggest that. I suspect some guy is very busy ignoring all of those. Obviously I expect nothing will come out of that, since there is no way for me to follow up on them.
Update 2020
I still get errors when switching to Windows containers, lol
Docker.Core.DockerException:
Error response from daemon: open \\.\pipe\docker_engine_windows: The system cannot find the file specified.
at Docker.Engines.DockerDaemonChecker.<CheckAsync>d__5.MoveNext() in C:\workspaces\stable-2.3.x\src\github.com\docker\pinata\win\src\Docker.Desktop\Engines\DockerDaemonChecker.cs:line 40
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Docker.Engines.WindowsContainersEngine.<DoStartAsync>d__12.MoveNext() in C:\workspaces\stable-2.3.x\src\github.com\docker\pinata\win\src\Docker.Desktop\Engines\WindowsContainersEngine.cs:line 52
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Docker.ApiServices.StateMachines.TaskExtensions.<WrapAsyncInCancellationException>d__0.MoveNext() in C:\workspaces\stable-2.3.x\src\github.com\docker\pinata\win\src\Docker.ApiServices\StateMachines\TaskExtensions.cs:line 29
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Docker.ApiServices.StateMachines.StartTransition.<DoRunAsync>d__5.MoveNext() in C:\workspaces\stable-2.3.x\src\github.com\docker\pinata\win\src\Docker.ApiServices\StateMachines\StartTransition.cs:line 67
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Docker.ApiServices.StateMachines.StartTransition.<DoRunAsync>d__5.MoveNext() in C:\workspaces\stable-2.3.x\src\github.com\docker\pinata\win\src\Docker.ApiServices\StateMachines\StartTransition.cs:line 92
And resetting to Factory Defaults did nothing to fix this: I "reset", switched to Windows containers, and got the same error.
Update 2020 vol. deux
I found a way to follow up on the crash reports! I created an issue, I've never been so excited in my life.
https://github.com/docker/for-win/issues/8226

How to fix docker daemon that will not restart due to hns error

Docker for Windows Server
Windows Server version 1709, with containers
Docker version 17.06.2-ee-6, build e75fdb8
Swarm mode (worker node, part of swarm with ubuntu masters)
After containers connected to an overlay network started intermittently losing their network adapters, I restarted the machine. Now the daemon will not start. Below are the last lines of output from running docker -D.
Please let me know how to fix this.
time="2018-05-15T15:10:06.731160000Z" level=debug msg="Option Experimental: false"
time="2018-05-15T15:10:06.731160000Z" level=debug msg="Option DefaultDriver: nat"
time="2018-05-15T15:10:06.731160000Z" level=debug msg="Option DefaultNetwork: nat"
time="2018-05-15T15:10:06.734183700Z" level=info msg="Restoring existing overlay networks from HNS into docker"
time="2018-05-15T15:10:06.735174400Z" level=debug msg="[GET]=>[/networks/] Request : "
time="2018-05-15T15:12:06.789120400Z" level=debug msg="Network (d4d37ce) restored"
time="2018-05-15T15:12:06.796122200Z" level=debug msg="Endpoint (4114b6e) restored to network (d4d37ce)"
time="2018-05-15T15:12:06.796122200Z" level=debug msg="Endpoint (819eb70) restored to network (d4d37ce)"
time="2018-05-15T15:12:06.797124900Z" level=debug msg="Endpoint (ade55ea) restored to network (d4d37ce)"
time="2018-05-15T15:12:06.798125600Z" level=debug msg="Endpoint (d0054fc) restored to network (d4d37ce)"
time="2018-05-15T15:12:06.798125600Z" level=debug msg="Endpoint (e2af8d8) restored to network (d4d37ce)"
time="2018-05-15T15:12:06.854118500Z" level=debug msg="[GET]=>[/networks/] Request : "
time="2018-05-15T15:14:06.860654000Z" level=debug msg="start clean shutdown of all containers with a 15 seconds timeout..."
Error starting daemon: Error initializing network controller: hnsCall failed in Win32: Server execution failed (0x80080005)
Here is the complete set of steps to rebuild the Docker networking state on a swarm host; a consolidated PowerShell sketch follows the list. Sometimes only some of the steps are sufficient (specifically the HNS part), so you can try those first.
Remove all Docker services and user-defined networks (i.e. all Docker networks except `nat` and `none`).
Leave the swarm cluster (docker swarm leave --force)
Stop the docker service (PS C:\> stop-service docker)
Stop the HNS service (PS C:\> stop-service hns)
In regedit, delete all of the registry keys under these paths:
HKLM:\SYSTEM\CurrentControlSet\Services\vmsmp\parameters\SwitchList
HKLM:\SYSTEM\CurrentControlSet\Services\vmsmp\parameters\NicList
Now go to Device Manager, and disable then remove all network adapters that are “Hyper-V Virtual Ethernet…” adapters
Now rename your HNS.data file (the goal is to effectively “delete” it by renaming it):
C:\ProgramData\Microsoft\Windows\HNS\HNS.data
Also rename C:\ProgramData\docker folder (the goal is to effectively “delete” it by renaming it)
C:\ProgramData\docker
Now reboot your machine
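For convenience, the scriptable part of the steps above looks roughly like this in an elevated PowerShell session (the Device Manager step still has to be done by hand; renaming is used instead of deleting so the data can be restored if needed):
# step 1: remove services and user-defined networks first, e.g.
docker service ls -q | ForEach-Object { docker service rm $_ }
docker swarm leave --force
Stop-Service docker
Stop-Service hns
Remove-Item "HKLM:\SYSTEM\CurrentControlSet\Services\vmsmp\parameters\SwitchList\*" -Recurse
Remove-Item "HKLM:\SYSTEM\CurrentControlSet\Services\vmsmp\parameters\NicList\*" -Recurse
Rename-Item "C:\ProgramData\Microsoft\Windows\HNS\HNS.data" "HNS.data.old"
Rename-Item "C:\ProgramData\docker" "docker.old"
Restart-Computer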

Login attempts to Nexus OSS Docker repo throwing 404

We are trying to set up a Docker repository in Nexus OSS (v3.3.2-02) in a Kubernetes cluster, and are having issues logging in to it. We intend to have a proxy set up for Docker Hub, a private repo, and a group repo to tie the two together, using the configurations below:
Hosted
Proxy
Group
giving us the following list:
But when I try to log in to the repository, it appears it's trying to forward me to a /v2 endpoint, which is throwing a 404 error:
> docker login -u <user> -p <pass> https://repo.myhost.com:443
Error response from daemon: login attempt to https://repo.myhost.com:443/v2/ failed with status: 404 Not Found
I would like to add that we have Maven and NPM repositories set up in this same instance and they're working, so it appears Nexus itself is OK, but there's something wrong with the Docker configuration.
I don't know why the login request is being sent to the /v2 endpoint. What am I missing?
Docker requires a very specific URL layout and does not allow for any context path, hence the need for Docker connectors to let the Docker client connect to NXRM. Your screenshot shows you have configured the Docker connector for your Docker hosted repository on port 444, but your terminal capture shows you're attempting to connect on port 443, which isn't your Docker connector port. The error message you have suggests your NXRM server does indeed run on port 443, but because of how Docker works you need to access it on port 444. Please try: docker login -u <user> -p <pass> https://repo.myhost.com:444 so it uses your Docker connector port. Also, it's always a good idea to run the latest version of Nexus.
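One way to sanity-check which port actually speaks the Docker registry API (assuming curl is available) is to hit the /v2/ endpoint directly; a working connector typically answers with 200 or 401 (and a Docker-Distribution-Api-Version header), while a plain 404 means no registry is listening on that port:
curl -i https://repo.myhost.com:444/v2/
curl -i https://repo.myhost.com:443/v2/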
In an experiment I just ran (docker-machine, VirtualBox, macOS), when the server was 1.13.1 (as was the docker CLI), it degraded gracefully from /v2 down to /v1, like so:
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/info"
level=debug msg="Calling POST /v1.26/auth"
level=debug msg="attempting v2 login to registry endpoint https://192.168.2.103:9999/v2/"
level=info msg="Error logging in to v2 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v2/: EOF"
level=debug msg="attempting v1 login to registry endpoint https://192.168.2.103:9999/v1/"
level=info msg="Error logging in to v1 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v1/users/: dial tcp 192.168.2.103:9999: getsockopt: connection refused"
level=error msg="Handler for POST /v1.26/auth returned error: Get https://192.168.2.103:9999/v1/users/: dial tcp 192.168.2.103:9999: getsockopt: connection refused"
but after I upgraded the server to 17.06.0-ce (still with 1.13.1 cli), it only attempted /v2 and then quit:
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.30/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.30/info"
level=debug msg="Calling POST /v1.30/auth"
level=debug msg="attempting v2 login to registry endpoint https://192.168.2.103:9999/v2/"
level=info msg="Error logging in to v2 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v2/: tls: oversized record received with length 21584"
level=error msg="Handler for POST /v1.30/auth returned error: Get https://192.168.2.103:9999/v2/: tls: oversized record received with length 21584"
So the answer appears to be that one either needs to teach Nexus to respond correctly to the /v2 endpoints (as it really should be doing already), or downgrade dockerd to a version that speaks the /v1 API, if that is the behavior you're after.
Not sure if this is going to help, but the browser-based URL does not have a port number in it, and I could log in with my credentials there. An example browser-based URL is below:
https://nexus.mysite.net/
However, from the CLI I had to key in the following:
docker login -u -p https://nexus.mysite.net/
and I am greeted with the following:
Error response from daemon: login attempt to https://nexus.mysite.net/v2/ failed with status: 404 Not Found
With the right port number the above error did not show up, and I could log in from the CLI as follows:
docker login -u the-user-name -p the-password https://nexus.mysite.net:7000
(in my case the correct port number was 7000).
Hope this helps.

Why does "docker login" fail in Docker Quickstart Terminal but work from within the default machine?

I've installed Docker Toolbox in Windows 8.1 and have been following the installation tutorial. When getting to the step where you create and push your own image, I got this error when I attempted to run docker login ... .
### VIA Docker Quickstart Terminal
### docker login --username=myuser --password="mypass" --email=myemail@gmail.com
time="2015-11-17T03:20:58.160803558Z" level=debug msg="Calling POST /v1.21/auth"
time="2015-11-17T03:20:58.160838971Z" level=info msg="POST /v1.21/auth"
time="2015-11-17T03:20:58.169033324Z" level=debug msg="hostDir: /etc/docker/certs.d/https:/registry-win-tp3.docker.io/v1"
time="2015-11-17T03:20:58.169071565Z" level=debug msg="pinging registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:58.169084660Z" level=debug msg="attempting v1 ping for registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:58.898542338Z" level=debug msg="Error unmarshalling the _ping PingResult: invalid character '<' looking for beginning of value"
time="2015-11-17T03:20:58.898803841Z" level=debug msg="PingResult.Version: \"\""
time="2015-11-17T03:20:58.898818084Z" level=debug msg="Registry standalone header: ''"
time="2015-11-17T03:20:58.898836197Z" level=debug msg="PingResult.Standalone: true"
time="2015-11-17T03:20:58.898853685Z" level=debug msg="attempting v1 login to registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:59.478756938Z" level=error msg="Handler for POST /v1.21/auth returned error: Unexpected status code [403] : <html><body><h1>403 Forbidden</h1>\nRequest forbidden by administrative rules.\n</body></html>\n\n"
time="2015-11-17T03:20:59.478815334Z" level=error msg="HTTP Error" err="Unexpected status code [403] : <html><body><h1>403 Forbidden</h1>\nRequest forbidden by administrative rules.\n</body></html>\n\n" statusCode=500
Trying to solve the issue, I tried running docker login ... from within the Docker default VM. And there it works!
### VIA default virtual machine (192.168.99.100)
### docker login --username=myuser --password="mypass" --email=myemail@gmail.com https://index.docker.io/v1/
time="2015-11-17T03:20:46.053333255Z" level=debug msg="Calling POST /v1.21/auth"
time="2015-11-17T03:20:46.053404176Z" level=info msg="POST /v1.21/auth"
time="2015-11-17T03:20:46.082796012Z" level=debug msg="hostDir: /etc/docker/certs.d/https:/index.docker.io/v1"
time="2015-11-17T03:20:46.082930763Z" level=debug msg="pinging registry endpoint https://index.docker.io/v1/"
time="2015-11-17T03:20:46.082946790Z" level=debug msg="attempting v1 ping for registry endpoint https://index.docker.io/v1/"
time="2015-11-17T03:20:46.082959103Z" level=debug msg="attempting v1 login to registry endpoint https://index.docker.io/v1/"
I notice that they're using two different URLs and that the first one encounters a parsing error. The credentials are obviously correct since they work from within the VM, unless the two domains don't share users. Are the URLs or the response being mangled by MINGW64?
Update February 2016
PR 19891 "Enable cross-platforms login to Registry" is supposed to fix the issue:
Use a daemon-defined Registry URL for docker login.
This allows a Windows client interacting with a Linux daemon to properly use the default Registry endpoint instead of the Windows specific one.
It is in commit 19eaa71 (maybe for docker 1.10?)
This is reported both in docker/docker issue 15612 and docker/docker issue 18019
After some analysis of the source code I’ve detected that we have different registry URLs for Windows and UNIX.
Windows: https://registry-win-tp3.docker.io/v1/
Unix: https://index.docker.io/v1/
The Windows URL comes from a recent PR 15417 with the comment:
// Currently it is a TEMPORARY link that allows Microsoft to continue
// development of Docker Engine for Windows.
So it is possible this URL won't work (unless you are on a very recent Windows Server 2016).
There seems to be a workaround in docker/hub-feedback issue 473, which involves:
specifying the default index registry, docker.io, explicitly:
docker login --username=myuser --password=mypassword --email=myemail https://index.docker.io/v1/
WARNING: login credentials saved in C:\Users\myuser\.docker\config.json
Login Succeeded
modifying the config.json file created by the previous step, in order to add the same index.docker.io credentials for the registry-win-tp3 endpoint as well:
config.json:
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "myhash",
      "email": "myemail"
    },
    "https://registry-win-tp3.docker.io/v1/": {
      "auth": "myhash",
      "email": "mydomain"
    }
  }
}
After that, a docker push index.docker.io/myuser/myrepo:latest does work.
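(For completeness, the tag-and-push sequence looks like this; mylocalimage is just a placeholder for the locally built image.)
docker tag mylocalimage:latest index.docker.io/myuser/myrepo:latest
docker push index.docker.io/myuser/myrepo:latest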

Resources