docker push to docker hub and/or AWS ECR fails - docker

My OS: Ubuntu 20.04 LTS
Docker Version: 20.10.17, build 100c701
Symptoms: I do a docker push to either Docker Hub or AWS ECR. One or more of the layers will fail and retry, fail and retry, fail and retry. Sometimes it looks like it's nearly done before it fails, sometimes it fails sooner. Eventually the entire command fails.
If I retry the command, the same problem layers remain problem layers.
If I rebuild my image, the problem may get better or worse. It does appear to move around with a different image build, but at one point I was pushing a base image (and it was failing), so I pushed a child image, and that push failed on the same layer as the base image while being peachy-keen with the other layers.
Web searches suggest fixes from 2020 or 2021, which certainly should be in the mainstream now, although perhaps both Amazon and Docker Hub are running ancient (and broken) versions.
Additional Info:
Tried from my Mac. Same failure.
ca1399d10d43: Layer already exists
b74197196d00: Layer already exists
2c9fd6cbb874: Retrying in 7 seconds
d79f7f0a3cf1: Layer already exists
36eb8e32aa2f: Layer already exists
It's not an authentication problem. And it's really quite consistent -- some layers upload, some don't. So I don't see how it can be a network issue.
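One workaround that has helped with similar flaky-upload reports (not confirmed for this exact case) is to throttle the push so that only one layer uploads at a time, via the daemon's max-concurrent-uploads setting. A minimal sketch, assuming a systemd-based Linux install and no existing /etc/docker/daemon.json (merge by hand if you already have one):
$ sudo tee /etc/docker/daemon.json <<'EOF'
{
  "max-concurrent-uploads": 1
}
EOF
$ sudo systemctl restart docker
$ docker push myrepo/myimage:mytag    # hypothetical image name; retry the failing push
The daemon default is 5 concurrent uploads; dropping it to 1 mainly helps distinguish a rate or connection problem from a genuinely bad layer.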

Related

Docker push fails when a file with whitespace in its name is included in the Docker image

I'm having problems with a custom Docker image that contains some files with whitespace in their names. When I execute the docker push command I get this error:
$> docker push example.azurecr.io/myimage
The push refers to repository [example.azurecr.io/myimage]
ecaa33aa3064: Pushing [==================================================>] 59MB/59MB
3f06df57be30: Pushing [==================================================>] 21.31MB/21.31MB
ca31a9af4714: Layer already exists
09eb78ab1afc: Layer already exists
62386d2295bd: Layer already exists
f7afe9869eba: Layer already exists
e2eb06d8af82: Layer already exists
svm.runProcess: command cat /tmp/d2/app/wwwroot/fonts/FranziskaWeb W03 BlackItalic.ttf failed with exit code 1
I run the Docker engine on Windows Server 2019 with the Linux containers feature enabled.
Unfortunately I'm not able to write a Dockerfile that reproduces this error.
Someone else on the Internet got this same error, but I found no solution. As far as you know, does Docker have any problem pushing images that contain files with whitespace in their names?
If you are not using the latest version, there is a chance that updating will fix it. It has been a known problem, based on a comment in the code.
It seems to have been fixed in February 2019.
If you already have the latest version, or you can't update, then I fear you are out of luck unless you can rename these specific files so they no longer contain whitespace.
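If renaming is an option, here is a minimal sketch of stripping whitespace from the affected file names in the build context before building (the fonts path mirrors the error message above and is an assumption; replacing spaces with underscores is just one choice):
$ # bash: replace spaces with underscores in file names under the fonts directory
$ find ./app/wwwroot/fonts -type f -name '* *' -print0 | while IFS= read -r -d '' f; do mv "$f" "${f// /_}"; done
Remember to update any references to the renamed files (CSS, config, etc.) so the image still works.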
svm.runProcess seems to be part of LCOW (Linux Containers on Windows) image support, which is experimental and deprecated. See the multiple pull requests that recently removed this feature, and the initial one. Containers on Windows are moving towards WSL 2; however, it also seems that now (July 2021) they are moving away from WSL support on Windows Server, based on this GitHub comment:
"The Windows Desktop SKUs (where WSL 2 is supported) are recommended SKUs for these scenarios. For those who would like to use Linux in production scenarios in a server environment we recommend using server products such as Hyper-V VMs, and AKS on Azure Stack HCI. As always, we welcome any further Windows Server feedback through our Feedback Hub."
There is a bigger issue discussing WSL 2 on Windows Server, available here.

docker image stuck at pulling fs layer

We are using a self-hosted version of JFrog Artifactory. For some reason Artifactory went down, which we were able to resolve by restarting the server. After the reboot everything seemed to come back up correctly; however, we realized that anything using a particular layer, ddad3d7c1e96 (which actually pulls down alpine 3.11.11 from Docker Hub), was now getting stuck at "pulling fs layer" and would just sit there and keep retrying.
We tried "Zapping Caches" and running the maintenance option for Garbage Collection/Unused artifacts but we were still unable to download any image that had this particular layer.
Is it possible that this layer was somehow cached in Artifactory and ended up being corrupted? We also looked through the filestore but were unable to find the blob.
After a few days the issue resolved itself and images can now be pulled, but we are left without an explanation...
Does anyone have an idea of what could have happened, and how we can clear out cached parent images which are not directly in our container registry but are pulled through it from Docker Hub or another registry?
We ran into a similar issue. Zapping caches worked for us; however, it took some time to complete.
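One way to narrow down where the corruption sits is to ask Docker Hub directly, bypassing Artifactory, whether it can still serve the layer, using the plain registry v2 API. A hedged sketch, assuming jq is installed; <layer-digest> stands for the layer's sha256 digest from the image manifest (not the short local ID like ddad3d7c1e96):
$ # Get an anonymous pull token for library/alpine from Docker Hub's auth service
$ TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull" | jq -r .token)
$ # Ask for the blob; a final 200 means the upstream registry can serve it
$ curl -sIL -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" \
    "https://registry-1.docker.io/v2/library/alpine/blobs/sha256:<layer-digest>"
If the upstream serves the blob fine but the same pull through the Artifactory remote repository keeps failing, that points at the cached copy in Artifactory rather than at Docker Hub.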

Docker fails on changed GCP virtual machine?

I have a problem with Docker that seems to happen when I change the machine type of a Google Compute Platform VM instance. Images that were fine fail to run, fail to delete, and fail to pull, all with various obscure messages about missing keys (this is on Linux), duplicate or missing layers, and others I don't recall.
The errors don't always happen. One that occurred just now, with an image that ran a couple hundred times yesterday on the same setup, though before a restart, was:
$ docker run --rm -it mbloore/model:conda4.3.1-aq0.1.9
docker: Error response from daemon: layer does not exist.
$ docker pull mbloore/model:conda4.3.1-aq0.1.9
conda4.3.1-aq0.1.9: Pulling from mbloore/model
Digest: sha256:4d203b18fd57f9d867086cc0c97476750b42a86f32d8a9f55976afa59e699b28
Status: Image is up to date for mbloore/model:conda4.3.1-aq0.1.9
$ docker rmi mbloore/model:conda4.3.1-aq0.1.9
Error response from daemon: unrecognized image ID sha256:8315bb7add4fea22d760097bc377dbc6d9f5572bd71e98911e8080924724554e
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
$
So it thinks it has no images, but the Docker folders are full of files, and it does know some hashes. It looks like some index has been damaged.
I restarted that instance, and then Docker seemed to be normal again without any special action on my part.
The only workarounds I have found so far are to restart and hope, or to delete several large Docker directories and recreate them empty. Then, after a restart, pull and run work again. But I'm now not sure that they always will.
I am running with Docker version 17.05.0-ce on Debian 9. My images were built with Docker version 17.03.2-ce on Amazon Linux, and are based on the official Ubuntu image.
Has anyone had this kind of problem, or know a way to reset the state of Docker without deleting almost everything?
Two points:
1) It seems that changing the VM had nothing to do with it. On some boots Docker worked, on others not, with no change in configuration or contents.
2) At Google's suggestion I installed Stackdriver monitoring and logging agents, and I haven't had a problem through seven restarts so far.
My first guess is that there is a race condition on startup, and adding those agents altered it in my favour. Of course, I'd like to have a real fix, but for now I don't have the time to pursue the problem.
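For reference, the "delete several large Docker directories" workaround mentioned in the question usually amounts to wiping /var/lib/docker while the daemon is stopped. A sketch, assuming a default Linux install; this discards all local images, containers and volumes:
$ sudo systemctl stop docker
$ sudo mv /var/lib/docker /var/lib/docker.corrupt   # keep a copy rather than deleting outright
$ sudo systemctl start docker                       # the daemon recreates an empty /var/lib/docker
$ docker pull mbloore/model:conda4.3.1-aq0.1.9      # then re-pull what you need
It is a blunt reset, not a fix, but it avoids guessing which of the internal directories holds the damaged index.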

gcloud docker push hanging

When I try to push new docker images to gcr.io using gcloud docker push, it frequently makes some progress before stalling out:
$ gcloud docker push gcr.io/foo-bar-1225/baz-quux:2016-03-23
The push refers to a repository [gcr.io/foo-bar-1225/baz-quux]
762ab2ceaa70: Pushing [> ] 556 kB/154.4 MB
2220ee6c7534: Pushing [===> ] 4.82 MB/66.11 MB
f99917176817: Layer already exists
8c1b4a49167b: Layer already exists
5f70bf18a086: Layer already exists
1967867932fe: Layer already exists
6b4fab929601: Layer already exists
550f16cd8ed1: Layer already exists
44267ec3aa94: Layer already exists
bd750002938c: Layer already exists
917c0fc99b35: Layer already exists
The push stays in this state indefinitely (I've left it for an hour without a byte of progress). If I Ctrl-C kill this process and rerun it, it gets to the exact same point and again makes no progress.
The only workaround I've found is to restart my computer and re-run "Docker Quickstart Terminal". Then the push succeeds.
Is there a workaround for stalled pushes that doesn't require frequently rebooting my computer? (I'm on Mac OS X.)
This seems to be an issue that Docker users on Mac have run into previously, as can be seen in this Docker thread: https://github.com/docker/docker/issues/5113
While there is no clear fix, a slightly better workaround is to restart the Docker machine rather than your computer each time.
You can run docker-machine restart default to reset docker to a working state.
Hope that helps.
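In practice the shell's Docker environment variables may also need to be re-pointed at the restarted VM. A short sketch, assuming the machine is named default:
$ docker-machine restart default
$ eval "$(docker-machine env default)"   # refresh DOCKER_HOST and friends in this shell
$ gcloud docker push gcr.io/foo-bar-1225/baz-quux:2016-03-23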

Docker hub image cache doesn't seem to be working

We have a continuous integration pipeline on CircleCI that does the following (a rough sketch of these steps as commands follows the list):
1. Loads repo/image:mytag1 from the cache directory to be able to use cached layers
2. Builds a new version: docker build -t repoimage:mytag2
3. Saves the new version to the cache directory with docker save
4. Runs tests
5. Pushes to Docker Hub: docker push repo/image:mytag2
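A rough sketch of what those five steps look like as commands (the cache path and test entrypoint are hypothetical, and repo/image assumes the missing slash in step 2 is a typo, as one of the answers below suggests):
$ docker load -i ~/cache/image.tar || true               # 1. load the cached image, tolerate a cache miss
$ docker build -t repo/image:mytag2 .                    # 2. build the new version
$ docker save -o ~/cache/image.tar repo/image:mytag2     # 3. save it back to the cache directory
$ docker run --rm repo/image:mytag2 ./run-tests.sh       # 4. run tests
$ docker push repo/image:mytag2                          # 5. push to Docker Hub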
The problem is with step 5. The push step takes 5 minutes every time. If I understand it correctly, docker hub is meant to cache layers so we don't have to re-push things like the base image and dependencies if they are not updated.
I ran the build twice in a row, and I see a lot of crossover in the hash of the layers being pushed. Yet rather than "Image already exists" I see "Image successfully pushed".
Here's the output of build 1's docker push, and here's build 2
If you diff those two files you'll see that only 2 layers differ in each build:
< ca44fed88be6: Buffering to Disk
< ca44fed88be6: Image successfully pushed
< 5dbd19bfac8a: Buffering to Disk
< 5dbd19bfac8a: Image successfully pushed
---
> 9136b10cfb72: Buffering to Disk
> 9136b10cfb72: Image successfully pushed
> 0388311b6857: Buffering to Disk
> 0388311b6857: Image successfully pushed
So why is it that all the images have to re-push every time?
Using a different tag creates a different image which, when pushed, cannot rely on the cache.
For example the two commands:
$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v1
$ docker commit -m "thing" -a "me" db65bf421f96 me/thing:v2
yield utterly distinct images even though they were created from the same image (db65bf421f96). When pushed, Docker Hub must treat them as completely separate images, as can be seen with:
$ docker images
REPOSITORY TAG IMAGE ID
me/thing v2 f14aa8ac6bae
me/thing v1 c7d72ccc1d71
The image IDs are unique, and thus the images are unique even if they differ only in their tags.
You could say "docker should recognize them as being bit-for-bit identical" and thus treat them as cacheable. But it doesn't (yet).
The only surprise for me in your example is that you got any duplicate image IDs at all.
Authoritative (if less explanatory) documentation can be found in Docker's "Build your own images" guide.
The process should work as you described. In fact we're building all of our images in this way without problems. Usually there are just a few changes to the topmost layers and only those are pushed to the registry - otherwise the whole concept of image layers would be useless.
See here for an example. Only the two topmost layers have changed and are pushed for :latest; for :4.0.2 there's no push at all. We're tagging images with git tags, and for some projects we even tag images with git describe to get rollback functionality, just in case.
You can get the project source-code also from GitHub to try it out.
A few things to note about the setup: We're using a self-hosted GitLab CI with a customized runner which runs docker and docker-compose on an isolated host with Docker 1.9.1, but that should not make any difference.
There may also be differences in the registry version. I had the feeling (but I am not 100% sure) that some older repos on Docker Hub are still running on registry v1 while newer ones are always on v2, so you could try creating a new repo and see if the issue still occurs.
Please note that the tag behavior described above only applies when pushing the same image name. If you push the same image layers under a different name, you always need to push all layers, even though they should already exist on the registry. So I guess repo/image:mytag1 and repoimage:mytag2 actually go to repo/image and the missing slash is just a typo.
Another cause could be that your images are built on different hosts on Circle CI, but then you should also get different layer IDs, so I think this is not very likely.
I suggest building an image manually to try to reproduce the problem, or contacting CircleCI about this issue.
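A minimal way to do that manual check (a sketch; the image name is a placeholder): build once, tag the same image twice, push both tags, and see whether the second push reports every layer as already existing.
$ docker build -t repo/image:test1 .
$ docker tag repo/image:test1 repo/image:test2   # same image ID, second tag
$ docker push repo/image:test1
$ docker push repo/image:test2                   # should skip every layer if registry caching works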
