For a school project, I have set up Watchtower to automatically redeploy the frontend whenever its image is updated.
However, Watchtower seems unable to pull the image; it gets a 401 Unauthorized error.
My docker-compose.yml is the following:
version: '3'
services:
  about_website:
    image: rg.nl-ams.scw.cloud/csc648-team01/about_website
    command: "-l 3000"
    ports:
      - "3000:3000"
    labels:
      com.centurylinklabs.watchtower.enable: true
      com.centurylinklabs.watchtower.scope: about_website
  watchtower:
    image: containrrr/watchtower:1.5.3
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: --interval 5 --cleanup --label-enable --debug --scope about_website
    labels:
      com.centurylinklabs.watchtower.scope: about_website
The image is public and can be pulled using
docker pull rg.nl-ams.scw.cloud/csc648-team01/about_website:latest
latest: Pulling from csc648-team01/about_website
Digest: sha256:99c55eb14bdcc3724e2ca751bb2c6e0d7633077b2657b4a69e0a30be0093c1c4
Status: Image is up to date for rg.nl-ams.scw.cloud/csc648-team01/about_website:latest
rg.nl-ams.scw.cloud/csc648-team01/about_website:latest
However, Watchtower isn't able to pull it. The logs show that it gets a 401 error:
time="2023-02-13T22:02:34Z" level=debug msg="Retrieving running containers"
2023-02-13T22:02:34.065154177Z time="2023-02-13T22:02:34Z" level=debug msg="Trying to load authentication credentials." container=/about_website-about_website-1 image="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:34.065371344Z time="2023-02-13T22:02:34Z" level=debug msg="No credentials for rg.nl-ams.scw.cloud found" config_file=/config.json
2023-02-13T22:02:34.065384885Z time="2023-02-13T22:02:34Z" level=debug msg="Got image name: rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:34.065388719Z time="2023-02-13T22:02:34Z" level=debug msg="Checking if pull is needed" container=/about_website-about_website-1 image="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:34.065391969Z time="2023-02-13T22:02:34Z" level=debug msg="Building challenge URL" URL="https://rg.nl-ams.scw.cloud/v2/"
2023-02-13T22:02:34.717871469Z time="2023-02-13T22:02:34Z" level=debug msg="Got response to challenge request" header="Bearer realm=\"https://api.scaleway.com/registry-internal/v1/regions/nl-ams/tokens\",service=\"registry\"" status="401 Unauthorized"
2023-02-13T22:02:34.718049427Z time="2023-02-13T22:02:34Z" level=debug msg="Checking challenge header content" realm="https://api.scaleway.com/registry-internal/v1/regions/nl-ams/tokens" service=registry
2023-02-13T22:02:34.718054969Z time="2023-02-13T22:02:34Z" level=debug msg="Setting scope for auth token" image=rg.nl-ams.scw.cloud/csc648-team01/about_website scope="repository:rg.nl-ams.scw.cloud/csc648-team01/about_website:pull"
2023-02-13T22:02:34.718060552Z time="2023-02-13T22:02:34Z" level=debug msg="No credentials found."
2023-02-13T22:02:35.429569719Z time="2023-02-13T22:02:35Z" level=debug msg="Parsing image ref" host=rg.nl-ams.scw.cloud image=csc648-team01/about_website normalized="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest" tag=latest
2023-02-13T22:02:35.429658511Z time="2023-02-13T22:02:35Z" level=debug msg="Doing a HEAD request to fetch a digest" url="https://rg.nl-ams.scw.cloud/v2/csc648-team01/about_website/manifests/latest"
2023-02-13T22:02:36.083084511Z time="2023-02-13T22:02:36Z" level=debug msg="Could not do a head request for \"rg.nl-ams.scw.cloud/csc648-team01/about_website:latest\", falling back to regular pull." container=/about_website-about_website-1 image="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:36.085633136Z time="2023-02-13T22:02:36Z" level=debug msg="Reason: registry responded to head request with \"401 Unauthorized\", auth: \"Bearer realm=\\\"https://api.scaleway.com/registry-internal/v1/regions/nl-ams/tokens\\\",service=\\\"registry\\\",scope=\\\"repository:csc648-team01/about_website:pull\\\",error=\\\"invalid_token\\\"\"" container=/about_website-about_website-1 image="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:36.087838428Z time="2023-02-13T22:02:36Z" level=debug msg="Pulling image" container=/about_website-about_website-1 image="rg.nl-ams.scw.cloud/csc648-team01/about_website:latest"
2023-02-13T22:02:38.074009471Z time="2023-02-13T22:02:38Z" level=debug msg="No new images found for /about_website-about_website-1"
2023-02-13T22:02:38.074042429Z time="2023-02-13T22:02:38Z" level=info msg="Session done" Failed=0 Scanned=1 Updated=0 notify=no
EDIT: I tried hosting my image on Docker Hub and it works fine.
It seems the issue comes from Scaleway :(
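For anyone debugging the same thing, the challenge header in the debug log can be taken apart to see which token endpoint and scope are involved. The following is a minimal Python sketch, not Watchtower's actual code: the helper names `parse_bearer_challenge` and `token_url` are made up for illustration, and the strings are copied from the log lines above. It also makes visible that the scope Watchtower logged includes the registry host while the scope in the registry's 401 reply does not; whether that difference is what triggers the `invalid_token` error is only an assumption.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> dict:
    """Split a 'Bearer key="value",...' challenge into a dict of its parameters."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: dict, scope: str) -> str:
    """Build the token request URL for a given scope (hypothetical helper)."""
    query = urlencode({"service": challenge["service"], "scope": scope})
    return f'{challenge["realm"]}?{query}'


# Header copied from the "Got response to challenge request" log line.
header = (
    'Bearer realm="https://api.scaleway.com/registry-internal/v1/regions/nl-ams/tokens",'
    'service="registry"'
)
challenge = parse_bearer_challenge(header)

# Scope Watchtower logged in "Setting scope for auth token".
watchtower_scope = "repository:rg.nl-ams.scw.cloud/csc648-team01/about_website:pull"
# Scope the registry named in its 401 reply to the HEAD request.
registry_scope = "repository:csc648-team01/about_website:pull"

print(token_url(challenge, registry_scope))
print("scopes match:", watchtower_scope == registry_scope)
```

Comparing the two scope strings side by side is a quick way to decide whether the problem is worth reporting to the registry provider or to the Watchtower project.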
EDIT
NEVER MIND THIS QUESTION. I found that one of my services, which uses Docker.DotNet, was terminating the services marked as Shutdown. I've corrected the bug and have regained my trust in Docker and Docker Swarm.
Thank you Carlos for your help. My bad, my fault. Sorry for that!
I have 13 services configured on a docker-compose file and running in Swarm mode with one manager and two worker nodes.
Then I make one of the worker nodes unavailable by draining it
docker node update --availability drain ****-v3-6by7ddst
What I notice is that all the services that were running on the drained node are removed and not rescheduled onto the available nodes.
The available worker and manager nodes still have plenty of resources. The services are simply removed. I am now down to 9 services.
Looking at the logs, I see entries like the ones below, repeated with different service IDs:
level=warning msg="Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap."
level=error msg="Error getting service u68b1fofzb3nefpnasctpywav: service u68b1fofzb3nefpnasctpywav not found"
level=warning msg="rmServiceBinding 021eda460c5744fd4d499475e5aa0f1cfbe5df479e5b21389ed1b501a93b47e1 possible transient state ok:false entries:0 set:false "
Then, for debug purposes I set my node back to available
docker node update --availability active ****-v3-6by7ddst
Then I try to balance some of the services to the newly available node. And this is the result.
I get the same error on the logs
level=error msg="Error getting service ****_frontend: service ****_frontend not found"
level=warning msg="rmServiceBinding 6bb220c0a95b30cdb3ff7b577c7e9dec7ad6383b34aff85e1685e94e7486e3ea possible transient state ok:false entries:0 set:false "
msg="Error getting service l29wlucttul75pzqo2sgr0u9e: service l29wlucttul75pzqo2sgr0u9e not found"
On my docker-compose file I am configuring all my services like this. Restart policy is any.
frontend:
  image: ${FRONTEND_IMAGE}
  deploy:
    labels:
      - "traefik.enable=true"
      - "traefik.docker.lbswarm=true"
      - "traefik.http.routers.frontend.rule=Host(`${FRONTEND_HOST}`)"
      - "traefik.http.routers.frontend.entrypoints=websecure"
      - "traefik.http.routers.frontend.tls.certresolver=myhttpchallenge"
      - "traefik.http.services.frontend.loadbalancer.server.port=80"
      - "traefik.docker.network=ingress"
    replicas: 1
    resources:
      limits:
        memory: ${FRONTEND_LIMITS_MEMORY}
        cpus: ${FRONTEND_LIMITS_CPUS}
      reservations:
        memory: ${FRONTEND_RESERVATION_MEMORY}
        cpus: ${FRONTEND_RESERVATION_CPUS}
    restart_policy:
      condition: any
  networks:
    - ingress
Something fails while recreating services on different nodes, and even with only one manager/worker node I get the same result.
The rest seems to work fine. As an example, if I scale a service it works well.
New Edit
Just did another test.
This time I only have two services, traefik and front-end.
One instance for traefik
4 instances for front-end
two nodes (one manager and one worker)
Drained worker node and front-end instances running on the drained node are moved to the manager node
Activated back the worker node
Did a docker service update cords_frontend --force and two instances of front-end are killed on the manager node and are placed running on the worker node.
So, with this test with only two services everything works fine.
Is there any kind of limit to the number of services a stack should have?
Any clues why this is happening?
Thanks
Hugo
I believe you may be running into an issue with resource reservations. You mention that the available nodes have plenty of resources, but the way reservations work, a service will not be scheduled if it can't reserve the resources it specifies. It is very important to note that this has nothing to do with how much of those resources the service is actually using. If you specify a reservation, you are basically saying that the service will reserve that amount of resources, and those resources are then not available for other services to use. So if all your services have similar reservations, you may be running into a situation where, even though the node shows available resources, those resources are in fact reserved by the existing services. I would suggest removing the reservations section to see whether that is in fact what is happening.
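To make the reservation logic concrete, here is a minimal Python sketch of the check described above. It is a simplified model, not the actual Swarm scheduler (which applies further filters); the `Node`, `Reservation` and `fits` names and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: float
    memory_mb: int


@dataclass
class Reservation:
    service: str
    cpus: float
    memory_mb: int


def fits(node: Node, placed: list[Reservation], new: Reservation) -> bool:
    """Return True if `new` can still be reserved on `node`.

    Only the reserved amounts count; what the services actually use is ignored.
    """
    used_cpus = sum(r.cpus for r in placed)
    used_mem = sum(r.memory_mb for r in placed)
    return (used_cpus + new.cpus <= node.cpus
            and used_mem + new.memory_mb <= node.memory_mb)


# Hypothetical node with 4 cores and 8 GB, already hosting three services
# that each reserve 1 core and 2 GB.
node = Node("worker-1", cpus=4.0, memory_mb=8192)
placed = [Reservation(f"svc{i}", cpus=1.0, memory_mb=2048) for i in range(3)]

print(fits(node, placed, Reservation("small", cpus=1.0, memory_mb=2048)))
print(fits(node, placed, Reservation("big", cpus=1.5, memory_mb=1024)))
```

In this model the second service is rejected even if the three placed services are idle, because the decision looks only at reserved amounts.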
So I am still struggling with this.
As a recap, I have one docker stack with 13 services running in swarm mode on two nodes (manager + worker). Each node has 4 cores and 8GB of RAM (Ubuntu 18.04, docker 19.03.12).
If a node dies, or I drain a node, all the services running on that node die and are marked as Removed. If I simply run docker service update front_end --force, the service also dies and is marked as Removed.
Another important detail: if I sum up all the reserved memory and cores from the 13 services, I end up with 1.9 cores and 4GB of RAM, well below each node's resources.
I don't see any out-of-memory errors in the container, service or stack logs. Also, using htop I can see that memory usage is 647MB/7.79GB on the manager node and 2GB/7.79GB on the worker node.
This is what I have tried so far:
separated the 13 services into two different stacks. No luck.
removed all the reservations tags from the compose files. No luck.
tried running with 3 nodes. No luck.
I was seeing the warning WARNING: No swap limit support, so I followed the suggestions in this document on both nodes. No luck.
Upped both nodes' resources to 8 cores and 16GB of RAM. No luck.
tried starting each service one at a time, and I noticed it starts behaving badly with 10 or more services. That is to say, everything works fine with up to 9 services running; beyond that, I see the behaviour described above.
Also, I enabled docker's debug mode to see what was happening. Here are the outputs.
If I run docker service update front_end --force and the front_end service dies, this is the output from docker events:
service update k6a7go4uhexb4b1u1fp98dtke (name=frontend)
service update k6a7go4uhexb4b1u1fp98dtke (name=frontend, updatestate.new=updating)
service remove k6a7go4uhexb4b1u1fp98dtke (name=frontend)
logs from journalctl -fu docker.service
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="form data: {\"EndpointSpec\":{\"Mode\":\"vip\"},\"Labels\":{\"com.docker.stack.image\":\"registry.gitlab.com/devteam/.frontend:1.0.5\",\"com.docker.stack.namespace\":\"\",\"traefik.docker.lbswarm\":\"true\",\"traefik.docker.network\":\"net\",\"traefik.enable\":\"true\",\"traefik.http.routers.frontend.entrypoints\":\"websecure\",\"traefik.http.routers.frontend.rule\":\"Host(`www.frontend.website`)\",\"traefik.http.routers.frontend.tls.certresolver\":\"myhttpchallenge\",\"traefik.http.services.frontend.loadbalancer.server.port\":\"80\"},\"Mode\":{\"Replicated\":{\"Replicas\":1}},\"Name\":\"frontend\",\"TaskTemplate\":{\"ContainerSpec\":{\"Image\":\"registry.gitlab.com/fdevteam/frontend:1.0.5#sha256:e9a0d88bc14848c3b40c3d2905842313bbc648c1bbf09305f8935f9eb23f289a\",\"Isolation\":\"default\",\"Labels\":{\"com.docker.stack.namespace\":\"f\"},\"Privileges\":{\"CredentialSpec\":null,\"SELinuxContext\":null}},\"ForceUpdate\":1,\"Networks\":[{\"Aliases\":[\"frontend\"],\"Target\":\"w7aqg3stebnmk5c5pbhgslh2d\"}],\"Placement\":{\"Platforms\":[{\"Architecture\":\"amd64\",\"OS\":\"linux\"}]},\"Resources\":{},\"RestartPolicy\":{\"Condition\":\"any\",\"MaxAttempts\":0},\"Runtime\":\"container\"}}"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent UPD 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 R:{frontend.1.lv7tjjaev45pvn0f7qtppb21r frontend nnlg81dsspnj6oxip4iqwwjc3 10.0.1.73 10.0.1.74 [] [frontend] [e661c9f39097] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 p:0xc004a1f880 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{nnlg81dsspnj6oxip4iqwwjc3 } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend rm_service:false suppress:false sAliases:[frontend] tAliases:[e661c9f39097]"
level=debug msg="delContainerNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend.1.lv7tjjaev45pvn0f7qtppb21r"
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend.1.lv7tjjaev45pvn0f7qtppb21r, 10.0.1.74, <nil>, true) rmServiceBinding sid:6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.74, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=error msg="Error getting service frontend: service frontend not found"
level=debug msg="handleEpTableEvent DEL 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 R:{frontend.1.lv7tjjaev45pvn0f7qtppb21r frontend nnlg81dsspnj6oxip4iqwwjc3 10.0.1.73 10.0.1.74 [] [frontend] [e661c9f39097] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 p:0xc004a1f880 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{nnlg81dsspnj6oxip4iqwwjc3 } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend rm_service:true suppress:false sAliases:[frontend] tAliases:[e661c9f39097]"
level=debug msg="delContainerNameResolution 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 frontend.1.lv7tjjaev45pvn0f7qtppb21r"
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend.1.lv7tjjaev45pvn0f7qtppb21r, 10.0.1.74, <nil>, true) rmServiceBinding sid:6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.74, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745 (w7aqg3s).deleteSvcRecords(frontend, 10.0.1.73, <nil>, false) rmServiceBinding sid:nnlg81dsspnj6oxip4iqwwjc3 "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend 6b20c2924ec1eafa20c27d572019207551819b10a2c4f8d0574f2e142274c745"
If the service does not die (which is the case with 9 or fewer services), this is the output:
service update n1wh16ru879699cpv3topcanc (name=frontend)
service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=updating)
service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=completed, updatestate.old=updating)
logs from journalctl -fu docker.service
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="form data: {\"EndpointSpec\":{\"Mode\":\"vip\"},\"Labels\":{\"com.docker.stack.image\":\"registry.gitlab.com/devteam/.frontend:1.0.5\",\"com.docker.stack.namespace\":\"\",\"traefik.docker.lbswarm\":\"true\",\"traefik.docker.network\":\"net\",\"traefik.enable\":\"true\",\"traefik.http.routers.frontend.entrypoints\":\"websecure\",\"traefik.http.routers.frontend.rule\":\"Host(`www.frontend.website`)\",\"traefik.http.routers.frontend.tls.certresolver\":\"myhttpchallenge\",\"traefik.http.services.frontend.loadbalancer.server.port\":\"80\"},\"Mode\":{\"Replicated\":{\"Replicas\":1}},\"Name\":\"frontend\",\"TaskTemplate\":{\"ContainerSpec\":{\"Image\":\"registry.gitlab.com/devteam/.frontend:1.0.5#sha256:e9a0d88bc14848c3b40c3d2905842313bbc648c1bbf09305f8935f9eb23f289a\",\"Isolation\":\"default\",\"Labels\":{\"com.docker.stack.namespace\":\"\"},\"Privileges\":{\"CredentialSpec\":null,\"SELinuxContext\":null}},\"ForceUpdate\":3,\"Networks\":[{\"Aliases\":[\"frontend\"],\"Target\":\"w7aqg3stebnmk5c5pbhgslh2d\"}],\"Placement\":{\"Platforms\":[{\"Architecture\":\"amd64\",\"OS\":\"linux\"}]},\"Resources\":{},\"RestartPolicy\":{\"Condition\":\"any\",\"MaxAttempts\":0},\"Runtime\":\"container\"}}"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent UPD e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 R:{frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.46 [] [frontend] [f986fe859440] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 p:0xc005e1fa00 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{n1wh16ru879699cpv3topcanc } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend rm_service:false suppress:false sAliases:[frontend] tAliases:[f986fe859440]"
level=debug msg="delContainerNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo"
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo, 10.0.1.46, <nil>, true) rmServiceBinding sid:e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.46, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent DEL e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 R:{frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.46 [] [frontend] [f986fe859440] true}"
level=debug msg="rmServiceBinding from handleEpTableEvent START for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 p:0xc005e1fa00 nid:w7aqg3stebnmk5c5pbhgslh2d sKey:{n1wh16ru879699cpv3topcanc } deleteSvc:true"
level=debug msg="deleteEndpointNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend rm_service:true suppress:false sAliases:[frontend] tAliases:[f986fe859440]"
level=debug msg="delContainerNameResolution e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo"
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend.1.zeq4jz8kzle4c7vtzx5ofbrqo, 10.0.1.46, <nil>, true) rmServiceBinding sid:e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(tasks.frontend, 10.0.1.46, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98 (w7aqg3s).deleteSvcRecords(frontend, 10.0.1.32, <nil>, false) rmServiceBinding sid:n1wh16ru879699cpv3topcanc "
level=debug msg="rmServiceBinding from handleEpTableEvent END for frontend e21b861c447ffd78bd2014744c13a146accd4600412c12b8cccfe3f3af4f0b98"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/tasks?filters=%7B%22_up-to-date%22%3A%7B%22true%22%3Atrue%7D%2C%22service%22%3A%7B%22frontend%22%3Atrue%7D%7D"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
level=debug msg="handleEpTableEvent ADD 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 R:{frontend.1.1v9ggahd87x2ydlkna0qx7jmz frontend n1wh16ru879699cpv3topcanc 10.0.1.32 10.0.1.47 [] [frontend] [3671840709bb] false}"
level=debug msg="addServiceBinding from handleEpTableEvent START for frontend 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 p:0xc004a1ed80 nid:w7aqg3stebnmk5c5pbhgslh2d skey:{n1wh16ru879699cpv3topcanc }"
level=debug msg="addEndpointNameResolution 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 frontend add_service:true sAliases:[frontend] tAliases:[3671840709bb]"
level=debug msg="addContainerNameResolution 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 frontend.1.1v9ggahd87x2ydlkna0qx7jmz"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(frontend.1.1v9ggahd87x2ydlkna0qx7jmz, 10.0.1.47, <nil>, true) addServiceBinding sid:521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(tasks.frontend, 10.0.1.47, <nil>, false) addServiceBinding sid:n1wh16ru879699cpv3topcanc"
level=debug msg="521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0 (w7aqg3s).addSvcRecords(frontend, 10.0.1.32, <nil>, false) addServiceBinding sid:n1wh16ru879699cpv3topcanc"
level=debug msg="addServiceBinding from handleEpTableEvent END for frontend 521ffeee31efe056900fb5a1fe73007c179594e964f703625cf3272eb14983c0"
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
...
level=debug msg="Calling GET /v1.40/services/frontend?insertDefaults=false"
level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service frontend not found" rpc=/docker.swarmkit.v1.Control/GetService
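The two docker events traces above differ only in their last event: the failing run ends in a service remove, the healthy one in updatestate.new=completed. A minimal Python sketch for spotting that difference automatically follows. It assumes event lines formatted exactly as pasted above (without the timestamp prefix that docker events normally prints), and `parse_event` and `removed_services` are hypothetical helper names.

```python
import re

# Matches lines such as: service update <id> (name=frontend, updatestate.new=updating)
EVENT_RE = re.compile(r"^service (?P<action>\w+) (?P<id>\w+) \((?P<attrs>[^)]*)\)$")


def parse_event(line: str) -> dict:
    """Turn one pasted event line into a dict of action, id and attributes."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised event line: {line!r}")
    attrs = dict(pair.split("=", 1) for pair in m["attrs"].split(", "))
    return {"action": m["action"], "id": m["id"], **attrs}


def removed_services(lines: list[str]) -> list[str]:
    """Names of services for which a 'remove' event appears in the trace."""
    return [e["name"] for e in map(parse_event, lines) if e["action"] == "remove"]


failing = [
    "service update k6a7go4uhexb4b1u1fp98dtke (name=frontend)",
    "service update k6a7go4uhexb4b1u1fp98dtke (name=frontend, updatestate.new=updating)",
    "service remove k6a7go4uhexb4b1u1fp98dtke (name=frontend)",
]
healthy = [
    "service update n1wh16ru879699cpv3topcanc (name=frontend)",
    "service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=updating)",
    "service update n1wh16ru879699cpv3topcanc (name=frontend, updatestate.new=completed, updatestate.old=updating)",
]

print(removed_services(failing))
print(removed_services(healthy))
```

A service remove event means something explicitly asked the API to delete the service, so a check like this can help narrow down when an external client is issuing the removal.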
I have a docker image that I can run properly in my local VM. Everything runs fine.
I saved the image and loaded it on the prod server.
I can see the image with docker images.
Next, I try to run it with docker run -p 9191:9191 myservice
It hangs and eventually times out.
The log shows the following:
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="HCSShim::CreateContainer succeeded id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152 handle=39044064"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="libcontainerd: Create() id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152, Calling start()"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="libcontainerd: starting container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:14:35.058232400-07:00" level=debug msg="HCSShim::Container::Start id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.393050900-07:00" level=debug msg="Result: {\"Error\":-2147023436,\"ErrorMessage\":\"This operation returned because the timeout period expired.\"}"
time="2018-08-15T16:18:25.394050800-07:00" level=error msg="libcontainerd: failed to start container: container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152 encountered an error during Start: failure in a Windows system call: This operation returned because the timeout period expired. (0x5b4)"
time="2018-08-15T16:18:25.394050800-07:00" level=debug msg="HCSShim::Container::Terminate id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.394050800-07:00" level=debug msg="libcontainerd: cleaned up after failed Start by calling Terminate"
time="2018-08-15T16:18:25.394050800-07:00" level=error msg="Create container failed with error: container b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152 encountered an error during Start: failure in a Windows system call: This operation returned because the timeout period expired. (0x5b4)"
time="2018-08-15T16:18:25.424053800-07:00" level=debug msg="attach: stdout: end"
time="2018-08-15T16:18:25.425055000-07:00" level=debug msg="attach: stderr: end"
time="2018-08-15T16:18:25.427054100-07:00" level=debug msg="Revoking external connectivity on endpoint boring_babbage (b20f403df0ed25ede9152f77eb0f8e049677f1279b68862a25bb9e2ab94babfb)"
time="2018-08-15T16:18:25.459087300-07:00" level=debug msg="[DELETE]=>[/endpoints/31e66619-5b57-47f2-9256-bbba54510e3b] Request : "
time="2018-08-15T16:18:25.548068700-07:00" level=debug msg="Releasing addresses for endpoint boring_babbage's interface on network nat"
time="2018-08-15T16:18:25.548068700-07:00" level=debug msg="ReleaseAddress(172.25.224.0/20, 172.25.229.142)"
time="2018-08-15T16:18:25.561064000-07:00" level=debug msg="WindowsGraphDriver Put() id b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.561064000-07:00" level=debug msg="hcsshim::UnprepareLayer flavour 1 layerId b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.566074800-07:00" level=debug msg="hcsshim::UnprepareLayer succeeded flavour 1 layerId=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.566074800-07:00" level=debug msg="hcsshim::DeactivateLayer Flavour 1 ID b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
time="2018-08-15T16:18:25.668075600-07:00" level=debug msg="hcsshim::DeactivateLayer succeeded flavour=1 id=b928213e42b2103cd33b676ed08a15529be10fffcfd7e709af86df5db79e4152"
I can see when it is trying to create the container and then it fails.
But why?
added more information
I finally found out how to check the server status for running container and I am getting this error message:
So it means the server doesn't have a network gateway?
How can I fix this problem?
Still looking.
More information
I deleted all the NATs and created a new one, so the online check passes now.
However, I still encounter other errors and can't run the image.
Something in the virtual network is wrong; I just can't find the right information to fix it... :(
We are trying to set up a Docker repository in Nexus OSS (v3.3.2-02) in a Kubernetes cluster, and are having issues logging in to it. We intend to have a proxy set up for DockerHub, a private repo, and a group repo to tie the two together, using the configurations below
Hosted
Proxy
Group
giving us the following list:
But when I try to log in to the repository, it appears it's trying to forward me to a /v2 endpoint, which is throwing a 404 error:
> docker login -u <user> -p <pass> https://repo.myhost.com:443
Error response from daemon: login attempt to https://repo.myhost.com:443/v2/ failed with status: 404 Not Found
I would like to add that we have Maven and NPM repositories set up in this same instance and they're working, so it appears Nexus itself is OK, but there's something wrong with the Docker configuration.
I don't know why this request is trying to send me to the /v2 endpoint when trying to log in. What am I missing?
Docker requires a very specific URL layout and does not allow any context path in the URL, hence the need for Docker connectors to let the Docker client connect to NXRM. Your screenshot shows you have configured a Docker connector for your Docker hosted repository on port 444, but your terminal capture shows you attempting to connect on port 443, which isn't your Docker connector port. The error message suggests your NXRM server does indeed run on port 443, but because of how Docker works you need to access it on port 444. Please try: docker login -u <user> -p <pass> https://repo.myhost.com:444 so that it uses your Docker connector port. Also, it's always a good idea to run the latest version of Nexus.
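A quick way to check which port actually speaks the registry protocol is to request /v2/ on each candidate and look at the status code: under the Docker Registry HTTP API V2, a 200 or 401 means a V2 registry is answering, while a 404 means the endpoint is not one. The following is a minimal Python sketch using only the standard library; `classify_v2_status` and `probe` are hypothetical helper names, and the host and ports in the usage note come from the question.

```python
import urllib.error
import urllib.request


def classify_v2_status(status: int) -> str:
    """Interpret the HTTP status returned by GET <base>/v2/."""
    if status in (200, 401):
        # 401 still means a V2 registry is there; it just wants credentials.
        return "registry"
    if status == 404:
        return "not-a-registry"
    return "unknown"


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Request <base_url>/v2/ and classify the answer.

    Connection-level failures (urllib.error.URLError) are not caught here
    and propagate to the caller.
    """
    url = base_url.rstrip("/") + "/v2/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is what matters here.
        status = err.code
    return classify_v2_status(status)
```

For the setup in the question, one would expect `probe("https://repo.myhost.com:443")` to report "not-a-registry" and `probe("https://repo.myhost.com:444")` to report "registry", assuming the connector is configured as described.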
In an experiment I just ran (docker-machine, virtualbox, macOS), when the server was 1.13.1 (as was the docker cli), it made a graceful degradation from /v2 down to /v1, like so:
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.26/info"
level=debug msg="Calling POST /v1.26/auth"
level=debug msg="attempting v2 login to registry endpoint https://192.168.2.103:9999/v2/"
level=info msg="Error logging in to v2 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v2/: EOF"
level=debug msg="attempting v1 login to registry endpoint https://192.168.2.103:9999/v1/"
level=info msg="Error logging in to v1 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v1/users/: dial tcp 192.168.2.103:9999: getsockopt: connection refused"
level=error msg="Handler for POST /v1.26/auth returned error: Get https://192.168.2.103:9999/v1/users/: dial tcp 192.168.2.103:9999: getsockopt: connection refused"
but after I upgraded the server to 17.06.0-ce (still with 1.13.1 cli), it only attempted /v2 and then quit:
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.30/version"
level=debug msg="Calling GET /_ping"
level=debug msg="Calling GET /v1.30/info"
level=debug msg="Calling POST /v1.30/auth"
level=debug msg="attempting v2 login to registry endpoint https://192.168.2.103:9999/v2/"
level=info msg="Error logging in to v2 endpoint, trying next endpoint: Get https://192.168.2.103:9999/v2/: tls: oversized record received with length 21584"
level=error msg="Handler for POST /v1.30/auth returned error: Get https://192.168.2.103:9999/v2/: tls: oversized record received with length 21584"
So the answer appears to be that one either needs to teach Nexus to respond correctly to the /v2 endpoints (as it really should be doing already), or downgrade dockerd to a version that still speaks the /v1 API, if that is the behavior you're after.
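As an aside on the second log, the number in "tls: oversized record received with length 21584" can be decoded by hand. A TLS record header is 5 bytes (one type byte, two version bytes, two length bytes), so the reported length is simply bytes 4 and 5 of whatever the server sent first. The sketch below, with hypothetical function names, shows that 21584 corresponds to the ASCII characters "TP". That hints, but does not prove, that whatever listens on port 9999 answered in plain text rather than with a TLS record.

```python
def length_field_as_ascii(reported_length: int) -> str:
    """Interpret a 16-bit TLS record length as two ASCII characters."""
    return reported_length.to_bytes(2, "big").decode("ascii")


def record_length(first_bytes: bytes) -> int:
    """Read the length field (bytes 4 and 5) of a 5-byte TLS record header."""
    return int.from_bytes(first_bytes[3:5], "big")


print(length_field_as_ascii(21584))  # TP
```

If that reading is right, the fix lies in how the port is exposed (TLS versus plain HTTP) rather than in the credentials.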
Not sure if this is going to help, but the browser-based URL does not have a port number in it, and I could log in there with my credentials. Example browser-based URL below.
https://nexus.mysite.net/
However, when I keyed in the following
docker login -u -p https://nexus.mysite.net/
I was greeted with the following
Error response from daemon: login attempt to https://nexus.mysite.net/v2/ failed with status: 404 Not Found
With the right port number the above error went away, and I could log in from the CLI as follows.
docker login -u the-user-name -p the-password https://nexus.mysite.net:7000
(in my case the correct port number was 7000).
Hope this helps.
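The 404 above is consistent with /v2/ being appended to whatever address is given to docker login, as the error message shows. Here is a small sketch, assuming a hypothetical helper named v2_probe_url, of which URL ends up being requested with and without the port. Ignoring any path in the address is an assumption made for this illustration.

```python
from urllib.parse import urlsplit


def v2_probe_url(login_address: str) -> str:
    """Build the /v2/ URL for a registry login address (illustrative only).

    This sketch keeps scheme, host and port and ignores any path in the
    address; that simplification is an assumption, not documented behavior.
    """
    parts = urlsplit(login_address)
    return f"{parts.scheme}://{parts.netloc}/v2/"


print(v2_probe_url("https://nexus.mysite.net/"))      # https://nexus.mysite.net/v2/
print(v2_probe_url("https://nexus.mysite.net:7000"))  # https://nexus.mysite.net:7000/v2/
```

The first URL is the one from the 404 message; the second is the one that the working login would reach.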
I've installed Docker Toolbox on Windows 8.1 and have been following the installation tutorial. When I got to the step where you create and push your own image, I got this error when running docker login ... :
### VIA Docker Quickstart Terminal
### docker login --username=myuser --password="mypass" --email=myemail#gmail.com
time="2015-11-17T03:20:58.160803558Z" level=debug msg="Calling POST /v1.21/auth"
time="2015-11-17T03:20:58.160838971Z" level=info msg="POST /v1.21/auth"
time="2015-11-17T03:20:58.169033324Z" level=debug msg="hostDir: /etc/docker/certs.d/https:/registry-win-tp3.docker.io/v1"
time="2015-11-17T03:20:58.169071565Z" level=debug msg="pinging registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:58.169084660Z" level=debug msg="attempting v1 ping for registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:58.898542338Z" level=debug msg="Error unmarshalling the _ping PingResult: invalid character '<' looking for beginning of value"
time="2015-11-17T03:20:58.898803841Z" level=debug msg="PingResult.Version: \"\""
time="2015-11-17T03:20:58.898818084Z" level=debug msg="Registry standalone header: ''"
time="2015-11-17T03:20:58.898836197Z" level=debug msg="PingResult.Standalone: true"
time="2015-11-17T03:20:58.898853685Z" level=debug msg="attempting v1 login to registry endpoint https://registry-win-tp3.docker.io/v1/"
time="2015-11-17T03:20:59.478756938Z" level=error msg="Handler for POST /v1.21/auth returned error: Unexpected status code [403] : <html><body><h1>403 Forbidden</h1>\nRequest forbidden by administrative rules.\n</body></html>\n\n"
time="2015-11-17T03:20:59.478815334Z" level=error msg="HTTP Error" err="Unexpected status code [403] : <html><body><h1>403 Forbidden</h1>\nRequest forbidden by administrative rules.\n</body></html>\n\n" statusCode=500
To troubleshoot, I ran docker login ... from within the Docker default VM. And there it works!
### VIA default virtual machine (192.168.99.100)
### docker login --username=myuser --password="mypass" --email=myemail#gmail.com https://index.docker.io/v1/
time="2015-11-17T03:20:46.053333255Z" level=debug msg="Calling POST /v1.21/auth"
time="2015-11-17T03:20:46.053404176Z" level=info msg="POST /v1.21/auth"
time="2015-11-17T03:20:46.082796012Z" level=debug msg="hostDir: /etc/docker/certs.d/https:/index.docker.io/v1"
time="2015-11-17T03:20:46.082930763Z" level=debug msg="pinging registry endpoint https://index.docker.io/v1/"
time="2015-11-17T03:20:46.082946790Z" level=debug msg="attempting v1 ping for registry endpoint https://index.docker.io/v1/"
time="2015-11-17T03:20:46.082959103Z" level=debug msg="attempting v1 login to registry endpoint https://index.docker.io/v1/"
I notice that they're using two different URLs and that the first one encounters a parsing error. The credentials are obviously correct since they work from within the VM, unless the two domains don't share users. Are the URLs or the response being mangled by MINGW64?
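On the parsing error in the first log: "invalid character '<' looking for beginning of value" is the kind of message Go's JSON decoder reports when handed a body that starts with "<", such as an HTML error page. The sketch below reproduces the same situation in Python using the 403 body shown in the log. The function name is hypothetical, and Python's own error text differs from Go's. It suggests, without proving, that the endpoint itself returned HTML rather than the URL or response being mangled locally.

```python
import json

# Body copied from the 403 response shown in the log above.
HTML_403 = (
    "<html><body><h1>403 Forbidden</h1>\n"
    "Request forbidden by administrative rules.\n"
    "</body></html>\n\n"
)


def parses_as_json(body: str) -> bool:
    """Return True if body is valid JSON, False otherwise."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return False
    return True


print(parses_as_json(HTML_403))  # False
print(parses_as_json("{}"))      # True
```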
Update February 2016
PR 19891 "Enable cross-platforms login to Registry" is supposed to have fixed the issue:
Use a daemon-defined Registry URL for docker login.
This allows a Windows client interacting with a Linux daemon to properly use the default Registry endpoint instead of the Windows specific one.
It is in commit 19eaa71 (maybe for docker 1.10?)
This is reported both in docker/docker issue 15612 and docker/docker issue 18019
After some analysis of the source code, I found that we have different registry URLs for Windows and UNIX:
Windows: https://registry-win-tp3.docker.io/v1/
Unix: https://index.docker.io/v1/
The Windows URL comes from a recent PR 15417, with the comment:
// Currently it is a TEMPORARY link that allows Microsoft to continue
// development of Docker Engine for Windows.
So it is possible this URL won't work (unless you are on a very recent Windows Server 2016).
There seems to be a workaround in docker/hub-feedback issue 473, which involves:
explicitly specifying the default index registry of docker.io when logging in,
docker login --username=myuser --password=mypassword --email=myemail https://index.docker.io/v1/
WARNING: login credentials saved in C:\Users\myuser\.docker\config.json
Login Succeeded
modifying the config.json file created by the previous step, so that the credentials stored for index.docker.io are also listed under the registry-win-tp3 URL:
config.json:
{
"auths": {
"https://index.docker.io/v1/": {
"auth": "myhash",
"email": "myemail"
},
"https://registry-win-tp3.docker.io/v1/": {
"auth": "myhash",
"email": "mydomain"
}
}
}
After that, a docker push index.docker.io/myuser/myrepo:latest does work.
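The manual edit of config.json described above can also be sketched in code. This is a minimal sketch that works on an in-memory dictionary with the same shape as the file shown above. The function name mirror_auth is hypothetical, the values "myhash" and "myemail" are the placeholders from the answer, and reading or writing the real file is deliberately left out.

```python
import copy
import json

HUB = "https://index.docker.io/v1/"
WIN = "https://registry-win-tp3.docker.io/v1/"


def mirror_auth(config: dict, source: str = HUB, target: str = WIN) -> dict:
    """Return a copy of config whose target registry reuses source's entry.

    Raises KeyError if config has no "auths" entry for source.
    """
    result = copy.deepcopy(config)
    result["auths"][target] = copy.deepcopy(result["auths"][source])
    return result


config = {"auths": {HUB: {"auth": "myhash", "email": "myemail"}}}
print(json.dumps(mirror_auth(config), indent=2))
```

The input dictionary is left untouched, so the result can be inspected before anything is written back to disk.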