Running `docker stack deploy` on a local VM results in a "No such image" error even though the image is on the public registry

I'm trying to follow the Docker Get Started guide. Currently I'm at part 4. Everything up until the point
docker stack deploy -c docker-compose.yml getstartedlab
worked well. However, after deploying the services, docker stack ps getstartedlab shows that the swarm manager keeps trying to restart the containers: every attempt fails with the error "No such image: username/get-st…" and the tasks end up in a state such as "Rejected 6 seconds ago".
I searched for solutions, but surprisingly it seems that nobody has encountered this error before. The issue here and a similar section in the Get Started guide talk about pulling from a private registry. However, throughout the tutorial I've been working with the default public registry. All previous steps (e.g. launching the swarm locally, without using VirtualBox) worked fine.
Versions:
Docker version 18.02.0-ce, build fc4de447b5
Virtualbox 5.2.8 r120774
System Kernel: 4.14.25-1-MANJARO
Any idea what might have been the problem?

Surprisingly, passing the --with-registry-auth flag to docker stack deploy worked, even though my repo is on Docker Hub. I'm not sure what the underlying problem was, but the claim that this flag is only needed for a private registry may be somewhat inaccurate.
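For reference, the redeploy that worked can be wrapped in a small shell function. This is only a sketch: deploy_with_auth is a hypothetical helper name, and the compose file and stack name in the usage comment are the ones from the question. The --with-registry-auth flag sends the registry authentication details held by the local Docker client to the swarm nodes.

```shell
# Hypothetical helper: deploy a stack and pass the client's registry
# credentials along to the swarm nodes that pull the image.
deploy_with_auth() {
  compose_file="$1"   # e.g. docker-compose.yml
  stack_name="$2"     # e.g. getstartedlab
  docker stack deploy --with-registry-auth -c "$compose_file" "$stack_name"
}

# Usage (not run here):
#   deploy_with_auth docker-compose.yml getstartedlab
#   docker stack ps getstartedlab   # check whether tasks are still "Rejected"
```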

Related

docker compose building and starting timeout

I have been using my docker-compose setup for a while, but today, for the first time, it gives me this error when I try to build or start it:
An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
Why is this happening?
I found similar questions, but their solutions don't work for me. I tried restarting Docker and even removing images and containers, but I still get this error.
Also, my docker-compose file doesn't set the tty option.
Try raising the timeout when starting Compose, for example:
COMPOSE_HTTP_TIMEOUT=300 docker-compose up -d
Please see Compose CLI environment variables for full details.

Testcontainers do not start after replacing Docker Desktop with minikube

I want to make Testcontainers in my Java integration tests work with minikube as a replacement for Docker Desktop.
I followed the article below to get started:
https://www.atomicjar.com/2021/10/docker-on-windows-and-macos/#minikube
This is what I've got in testcontainers.properties
docker.client.strategy=org.testcontainers.dockerclient.EnvironmentAndSystemPropertyClientProviderStrategy
docker.host=tcp\://192.168.64.2\:2376
docker.cert.path=/Users/username/.minikube/certs
docker.tls.verify=true
Although Docker is up and running, I'm getting the following exception:
Caused by: java.lang.IllegalStateException: Could not find a valid Docker environment. Please see logs and check configuration
Can anybody suggest how to make this work?
Thanks.
If you are using Gradle, try the --no-daemon flag so the build does not reuse an existing daemon. Your old Gradle daemon may still be using your previous Testcontainers properties. Also restart your IDE if you run the build inside it.
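A minimal sketch of that suggestion, assuming Gradle is available as gradle and that the integration tests run under the test task (both are assumptions; a project using the wrapper would call ./gradlew instead). gradle --stop shuts down running daemons and --no-daemon keeps the next build from reusing one, so that, if the diagnosis above is right, the tests pick up the current testcontainers.properties. The function name is hypothetical.

```shell
# Hypothetical helper: run the tests without reusing a long-lived Gradle daemon.
run_tests_without_daemon() {
  gradle --stop              # stop daemons started by this Gradle version
  gradle --no-daemon test    # 'test' is an assumed task name
}

# Usage (not run here):
#   run_tests_without_daemon
```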
After restarting minikube and the IntelliJ editor, and updating testcontainers-bom to the latest version (from 1.15 to 1.16.2), I was able to pull some third-party Docker images. This means Docker is working now.
However, I'm still trying to find a way to work with local images (Docker images of my other applications) for integration testing, as it used to work with Docker Desktop.

Docker fails on changed GCP virtual machine?

I have a problem with Docker that seems to happen when I change the machine type of a Google Cloud Platform (GCP) VM instance. Images that were fine fail to run, fail to delete, and fail to pull, all with various obscure messages about missing keys (this on Linux), duplicate or missing layers, and others I don't recall.
The errors don't always happen. One that occurred just now, with an image that ran a couple hundred times yesterday on the same setup, though before a restart, was:
$ docker run --rm -it mbloore/model:conda4.3.1-aq0.1.9
docker: Error response from daemon: layer does not exist.
$ docker pull mbloore/model:conda4.3.1-aq0.1.9
conda4.3.1-aq0.1.9: Pulling from mbloore/model
Digest: sha256:4d203b18fd57f9d867086cc0c97476750b42a86f32d8a9f55976afa59e699b28
Status: Image is up to date for mbloore/model:conda4.3.1-aq0.1.9
$ docker rmi mbloore/model:conda4.3.1-aq0.1.9
Error response from daemon: unrecognized image ID sha256:8315bb7add4fea22d760097bc377dbc6d9f5572bd71e98911e8080924724554e
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
$
So it thinks it has no images, but the Docker folders are full of files, and it does know some hashes. It looks like some index has been damaged.
I restarted that instance, and then Docker seemed to be normal again without any special action on my part.
The only workarounds I have found so far are to restart and hope, or to delete several large Docker directories and recreate them empty. Then, after a restart, pull and run work again. But I'm now not sure that they always will.
I am running with Docker version 17.05.0-ce on Debian 9. My images were built with Docker version 17.03.2-ce on Amazon Linux, and are based on the official Ubuntu image.
Has anyone had this kind of problem, or know a way to reset the state of Docker without deleting almost everything?
Two points:
1) It seems that changing the VM had nothing to do with it. On some boots Docker worked, on others not, with no change in configuration or contents.
2) At Google's suggestion I installed Stackdriver monitoring and logging agents, and I haven't had a problem through seven restarts so far.
My first guess is that there is a race condition on startup, and adding those agents altered it in my favour. Of course, I'd like to have a real fix, but for now I don't have the time to pursue the problem.

Re-running docker-compose in Windows says network configuration changed

I have docker-compose version 1.11.2 on Windows and am using a version 2.1 docker-compose.yml. Whenever I run something like docker-compose up or docker-compose run a second time, I get an error saying the network needs to be recreated because its configuration options changed (even though I didn't change anything). I can remove the network with docker network rm, but from other documentation and posts about docker-compose on Linux it seems this should be unnecessary.
I can reproduce this reliably but can't find any further information. Can anyone explain why I keep getting errors asking me to recreate the network, or at least how to work around it? I'm using the transparent driver to download some things while building the image, but the nat driver gives me a similar error. One of my scenarios is running docker-compose run on one of the services several times on the same machine as part of a cloud build/test.
Turns out this was a bug and was fixed in a subsequent update several weeks ago. I was told by one of the Docker developers that Windows 10 Creators Update was required as well.

Docker deployments fail on Marathon, work fine otherwise

I have been trying to deploy a Docker-containerized web application on Mesos using Mesosphere Marathon.
I first tried deploying my Play Framework application, which works fine when I launch it directly as a Docker container. Then I also tried the example application mentioned on the Mesosphere website. Both fail inside Marathon, but work fine when run as standalone Docker images.
The application shows up as "Waiting" or "Deploying" in the Marathon web UI, while on Mesos it fails. I have made sure that the Mesos slave is running fine.
I believe that because the application fails on Mesos, Marathon keeps trying to restart it, which is why I see these status messages almost all the time.
I have previously tried deploying the same application (without wrapping it inside the docker container) on Marathon (same installation) and it has worked fine. However, we really want to use Docker for our applications.
I have gone through plenty of tutorials and everything seems to be following the "rules". I don't understand what could be wrong.
Edit:
E1104 19:29:01.291219 4242 slave.cpp:3342] Container '9dbebe8c-5506-4f70-b560-34be39ecdc96' for executor 'mediator.30dbd1ed-82fc-11e5-b1d4-56847afe9799' of framework '64d39023-aad3-4fdc-8565-6d8e3ec9cb77-0000' failed to start: Failed to 'docker -H unix:///var/run/docker.sock pull devrep/message-mediator:latest': exit status = exited with status 1 stderr = Error: image devrep/message-mediator:latest not found
W1104 19:29:01.293334 4244 docker.cpp:1002] Ignoring updating unknown container: 9dbebe8c-5506-4f70-b560-34be39ecdc96
E1104 19:29:06.711524 4241 slave.cpp:3342] Container 'b7f8004a-2759-41ec-8169-61d04a7c4c3d' for executor 'mediator.343b027e-82fc-11e5-b1d4-56847afe9799' of framework '64d39023-aad3-4fdc-8565-6d8e3ec9cb77-0000' failed to start: Failed to 'docker -H unix:///var/run/docker.sock pull devrep/message-mediator:latest': exit status = exited with status 1 stderr = Error: image devrep/message-mediator:latest not found
Without an actual error message or the logs, it's hard to guess what your problem could be.
My first thought is that you should check whether your Mesos Slaves are started with the --containerizers=docker,mesos flag at all. If not, it can't work at all.
Also, if you're using a private registry, either make sure that Docker on your Mesos Slaves is configured to use it, or follow the guidelines in the Marathon docs on how to use a private registry.
Can you do a docker pull devrep/message-mediator:latest on any Mesos Slave?
Also, see
https://github.com/mesosphere/marathon/issues/1781
I know it's very late to answer, but this might be helpful. In your logs I see
devrep/message-mediator:latest
Here latest is the tag of your image. If you don't provide a tag in the container's Docker image field, or leave it out as below,
"container": {
  "type": "DOCKER",
  "docker": {
    "image": "devrep/message-mediator"
  }
}
it automatically tries to pull devrep/message-mediator:latest, which I highly doubt is present. So always add an explicit tag; in my case it was v1:
devrep/message-mediator:v1
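The defaulting rule described in this answer can be sketched as a small shell function. image_with_tag is a hypothetical helper, not part of Docker or Marathon; it only mirrors the rule that an image reference without a tag or digest is treated as :latest. It looks at the last path component so that a registry port (as in localhost:5000/app, an assumed example) is not mistaken for a tag.

```shell
# Hypothetical helper: print the image reference Docker would effectively pull.
image_with_tag() {
  image="$1"
  case "$image" in
    *@*) printf '%s\n' "$image"; return 0 ;;   # pinned by digest, leave as-is
  esac
  name="${image##*/}"                          # last path component only
  case "$name" in
    *:*) printf '%s\n' "$image" ;;             # explicit tag already present
    *)   printf '%s\n' "${image}:latest" ;;    # no tag: defaults to latest
  esac
}
```

For example, image_with_tag devrep/message-mediator prints devrep/message-mediator:latest, while image_with_tag devrep/message-mediator:v1 prints the reference unchanged.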

Resources