Dockerhub: listing all available versions of a given image? - docker

I'm looking for a way to list all publicly available versions of an image from Dockerhub. Is there a way this could be achieved?
Specifically, I'm interested in the openjdk:8-jdk-alpine images.
Dockerhub typically only lists the latest version of each image, and there is no link to historic versions. For openjdk, it's currently 8u191-jdk-alpine3.8.
However, it is possible to pull older versions if we know their image digest ID:
openjdk:8-jdk-alpine@sha256:1fd5a77d82536c88486e526da26ae79b6cd8a14006eb3da3a25eb8d2d682ccd6
openjdk:8-jdk-alpine@sha256:c5c705b462abab858066d412b3f871865684d8f837571c98b68e78c505dc7549
With some luck, I was able to find these digests for OpenJDK 8 (Java versions 1.8.0_171 and 1.8.0_151 respectively), by googling openjdk8 alpine digest and looking at github tickets, which included the image digest.
But, is there a systematic way for listing all publicly available digests?
Looking at the docker search documentation, there doesn't seem to be an option for listing image versions, only for searching by name.

You don't need digests to pull "old" images, you would rather use their tags (even if they are not displayed in Docker Hub).
I use the following command to retrieve tags of a particular image, parsing the output of https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags :
REPOSITORY=openjdk # can be "<registry>/<image_name>" ("google/cloud-sdk" for example)
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name'
Result for REPOSITORY=openjdk (1593 tags at the time of writing) looks like :
latest
10
10-ea
10-ea-32
10-ea-32-experimental
10-ea-32-jdk
10-ea-32-jdk-experimental
10-ea-32-jdk-slim
10-ea-32-jdk-slim-experimental
10-ea-32-jre
[...]
If you can't/don't want to install jq (tool to manipulate JSON), then you could use :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | \
tr '}' '\n' | \
awk -F: '{print $3}'
(I'm pretty sure I got this command from another question, but I can't find where)
You can of course filter the output of this command and keep only tags you're interested in :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name | select(match("^8.*jdk-alpine"))'
or :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name' | \
grep -E '^8.*jdk-alpine'

Related

Docker caching for travis builds

Docker caching is not yet available on travis: https://github.com/travis-ci/travis-ci/issues/5358
I'm trying to write a workaround by doing:
`docker save -o file.tar $(docker history -q image_name | grep -v missing)`
`docker load -i file.tar`
Which works great, gives me all the image layers back. My only problem now is the saving takes a long time, and most of the time I'm actually changing one layer, so I don't need to rewrite all the rest. Is there a way of telling the docker save command to skip layers already in file.tar?
In the manifest.json file inside the tar you have the information you need.
tar -xOf file.tar manifest.json
Check the value of the Config keys. The first 12 characters are the image id. You can use the command above, extract the image ids that you already have, and exclude them in your docker save command.
I'm not very good with bash scripting, but this works on my mac
tar -xOf file.tar manifest.json | tr , '\n' | grep -o '"Config":".*"' | awk -F ':' '{print $2}' | awk '{print substr($0,2,12)}'
Using this outputs everything
docker history -q IMAGE_HERE | grep -v missing && tar -xOf file.tar manifest.json | tr , '\n' | grep -o '"Config":".*"' | awk -F ':' '{print $2}' | awk '{print substr($0,2,12)}'
After this you only need to get the unique values. This could be done with sort and uniq -u, but for some reason, sort doesn't work as expected. This command assumes the presence of file.tar so take that into consideration too.
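The exclusion step described above can be sketched as a plain set difference. This is a minimal sketch, assuming the IDs from docker history and the IDs extracted from manifest.json have each been written to a file, one ID per line. The function name ids_not_saved and the file names in the usage comment are hypothetical.

```shell
# Print every ID from the first file that does not appear in the second file.
# -F: fixed strings, -x: whole-line match, -v: invert the match,
# -f: read the patterns (here: the already-saved IDs) from a file.
ids_not_saved() {
    grep -Fvx -f "$2" "$1"
}

# Hypothetical usage:
#   docker history -q image_name | grep -v missing > history.ids
#   (the tar/manifest.json pipeline above) > saved.ids
#   ids_not_saved history.ids saved.ids
```

Unlike sort piped into uniq -u over the combined output, this reports only the IDs that are missing from the tar, not the IDs that exist only in the tar.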
I couldn't find any append option for the docker save command. The above strategy could work with multiple tar files, each holding a different set of layers.

How to get a list of images on docker registry v2

I'm using docker registry v1 and I'm interested in migrating to the newer version, v2. But I need some way to get a list of images present on registry; for example with registry v1 I can execute a GET request to http://myregistry:5000/v1/search? and the result is:
{
"num_results": 2,
"query": "",
"results": [
{
"description": "",
"name": "deis/router"
},
{
"description": "",
"name": "deis/database"
}
]
}
But I can't find anything similar in the official documentation to get a list of images on the registry. Does anybody know a way to do it with the new version, v2?
For the latest (as of 2015-07-31) version of Registry V2, you can get this image from DockerHub:
docker pull distribution/registry:master
List all repositories (effectively images):
curl -X GET https://myregistry:5000/v2/_catalog
> {"repositories":["redis","ubuntu"]}
List all tags for a repository:
curl -X GET https://myregistry:5000/v2/ubuntu/tags/list
> {"name":"ubuntu","tags":["14.04"]}
If the registry needs authentication you have to specify username and password in the curl command
curl -X GET -u <user>:<pass> https://myregistry:5000/v2/_catalog
curl -X GET -u <user>:<pass> https://myregistry:5000/v2/ubuntu/tags/list
You can query the catalog at:
http://<ip/hostname>:<port>/v2/_catalog
Get catalogs
By default, the registry API returns 100 catalog entries per request.
When you curl the registry API:
curl --cacert domain.crt https://your.registry:5000/v2/_catalog
it is equivalent to:
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?n=100
This is a pagination method.
When there are more than 100 entries, you have two options:
First: give a bigger number
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?n=2000
Second: follow the "next" link URL
curl --cacert domain.crt https://your.registry:5000/v2/_catalog
The response header contains a Link element:
Link: </v2/_catalog?last=pro-octopus-ws&n=100>; rel="next"
The Link element carries the last entry of this request, so you can then request the next 'page':
curl --cacert domain.crt https://your.registry:5000/v2/_catalog?last=pro-octopus-ws
As long as the response header contains a Link element, you can repeat this in a loop.
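The loop mentioned above could look roughly like the following. This is only a sketch: the function names next_link and list_catalog_pages are made up for illustration, and it assumes the registry needs no authentication, has a certificate curl already trusts, and returns the next link as a relative path (as in the header shown above).

```shell
# Read HTTP response headers on stdin and print the target of the
# rel="next" Link header, or nothing if there is no such header.
next_link() {
    tr -d '\r' | sed -n 's/^[Ll]ink: *<\([^>]*\)>; *rel="next".*$/\1/p'
}

# Print every page of /v2/_catalog as raw JSON, one page per line.
# $1: registry base URL, e.g. https://your.registry:5000
list_catalog_pages() {
    registry=$1
    path='/v2/_catalog?n=100'
    headers=$(mktemp)
    while [ -n "$path" ]; do
        # -D writes the response headers to a file; the body goes to stdout.
        curl -sS -D "$headers" "$registry$path"
        echo
        path=$(next_link < "$headers")
    done
    rm -f "$headers"
}

# Hypothetical usage:
#   list_catalog_pages https://your.registry:5000
```

For a self-signed certificate, the --cacert domain.crt option used above would have to be added to the curl call.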
Get Images
The catalog result looks like this:
{
"repositories": [
"busybox",
"ceph/mds"
]
}
You can then list the tags of every repository in the catalog:
curl --cacert domain.crt https://your.registry:5000/v2/busybox/tags/list
returns:
{"name":"busybox","tags":["latest"]}
The latest version of Docker Registry available from https://github.com/docker/distribution supports the Catalog API (v2/_catalog), which makes it possible to list repositories.
If interested, you can try docker image registry CLI I built to make it easy for using the search features in the new Docker Registry distribution (https://github.com/vivekjuneja/docker_registry_cli)
This has been driving me crazy, but I finally put all the pieces together. As of 1/25/2015, I've confirmed that it is possible to list the images in the docker V2 registry (exactly as @jonatan mentioned above).
I would up-vote that answer, if I had the rep for it.
Instead, I'll expand on the answer. Since registry V2 is made with security in mind, I think it's appropriate to include how to set it up with a self signed cert, and run the container with that cert in order that an https call can be made to it with that cert:
This is the script I actually use to start the registry:
sudo docker stop registry
sudo docker rm -v registry
sudo docker run -d \
-p 5001:5001 \
-p 5000:5000 \
--restart=always \
--name registry \
-v /data/registry:/var/lib/registry \
-v /root/certs:/certs \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
-e REGISTRY_HTTP_DEBUG_ADDR=':5001' \
registry:2.2.1
This may be obvious to some, but I always get mixed up with keys and certs. The file that needs to be referenced to make the call @jonaton mentions above** is the domain.crt listed above. (Since I put domain.crt in /root, I made a copy into the user directory where it could be accessed.)
curl --cacert ~/domain.crt https://myregistry:5000/v2/_catalog
> {"repositories":["redis","ubuntu"]}
**The command above has been changed: -X GET didn't actually work when I tried it.
Note: https://myregistry:5000 ( as above ) must match the domain given to the cert generated.
We wrote a CLI tool for this purpose: docker-ls It allows you to browse a docker registry and supports authentication via token or basic auth.
Here is a nice little one liner (uses JQ) to print out a list of Repos and associated tags.
If you don't have jq installed, you can use: brew install jq
# This is my URL but you can use any
REPO_URL=10.230.47.94:443
curl -k -s -X GET https://$REPO_URL/v2/_catalog \
| jq '.repositories[]' \
| sort \
| xargs -I _ curl -s -k -X GET https://$REPO_URL/v2/_/tags/list
Install registry:2.1.1 or later (you can check the last one, here) and use GET /v2/_catalog to get the list.
https://github.com/docker/distribution/blob/master/docs/spec/api.md#listing-repositories
Example shell script that lists all images:
https://gist.github.com/OndrejP/a2386d08e5308b0776c0
I had to do the same here, and the above works, except that I had to provide login details because it was a local docker repository.
It is the same as above, but with the username/password supplied in the URL.
curl -k -X GET https://yourusername:yourpassword@theregistryURL/v2/_catalog
It comes back as unformatted JSON.
I piped it through the python formatter to make it easier to read, in case you would like to have it in this format.
curl -k -X GET https://yourusername:yourpassword@theregistryURL/v2/_catalog | python -m json.tool
Here's an example that lists all tags of all images on the registry. It handles a registry configured for HTTP Basic auth too.
THE_REGISTRY=localhost:5000
# Get username:password from docker configuration. You could
# inject these some other way instead if you wanted.
CREDS=$(jq -r ".[\"auths\"][\"$THE_REGISTRY\"][\"auth\"]" ~/.docker/config.json | base64 -d)
curl -s --user "$CREDS" https://$THE_REGISTRY/v2/_catalog | \
jq -r '.["repositories"][]' | \
xargs -I @REPO@ curl -s --user "$CREDS" https://$THE_REGISTRY/v2/@REPO@/tags/list | \
jq -M '.["name"] + ":" + .["tags"][]'
Explanation:
extract username:password from .docker/config.json
make a https request to the registry to list all "repositories"
filter the json result to a flat list of repository names
for each repository name:
make a https request to the registry to list all "tags" for that "repository"
filter the stream of result json objects, printing "repository":"tag" pairs for each tag found in each repository
Using the "/v2/_catalog" and "/tags/list" endpoints you can't really list all the images. If you pushed a few different images and tagged each of them "latest", you can't list the older images. You can still pull them if you refer to them by digest: "docker pull ubuntu@sha256:ac13c5d2...". So the answer is: there is no way to list images; you can only list tags, which is not the same thing.
I wrote an easy-to-use command line tool for listing images in various ways (like list all images, list all tags of those images, list all layers of those tags).
It also allows you to delete unused images in various ways, like delete only older tags of a single image or from all images etc. This is convenient when you are filling your registry from a CI server and want to keep only latest/stable versions.
It is written in python and does not need you to download bulky big custom registry images.
If someone gets this far:
Building on what others have already said above, here is a one-liner that writes the answer to a text file as formatted JSON.
curl "http://mydocker.registry.domain/v2/_catalog?n=2000" | jq . - > /tmp/registry.lst
This looks like
{
"repositories": [
"somerepo/somecontiner",
"somerepo_other/someothercontiner",
...
]
}
You might need to change the ?n=xxxx value to match how many repositories you have.
Next is a way to automatically remove old and unused containers.
This thread dates back a long time; the most recent tools that one should consider are skopeo and crane.
skopeo supports signing and has many other features, while crane is a bit more minimalistic, and I found it easier to integrate into a simple shell script.
Docker search registry v2 functionality is currently not supported at the time of this writing. See discussion since Feb 2015: "propose registry search functionality #206" https://github.com/docker/distribution/issues/206
I wrote a script, view-private-registry, that you can find: https://github.com/BradleyA/Search-docker-registry-v2-script.1.0
It is not pretty but it gets the information needed from the private registry.
Example of output from view-private-registry:
$ view-private-registry
busybox:latest
gcr.io/google_containers/etcd:2.0.9
gcr.io/google_containers/hyperkube:v0.21.2
gcr.io/google_containers/pause:0.8.0
google/cadvisor:latest
jenkins:latest
logstash:latest
mongo:latest
nginx:latest
python:2.7
redis:latest
registry:2.1.1
stackengine/controller:latest
tomcat:7
tomcat:latest
ubuntu:14.04.2
Number of images: 16
Disk space used: 1.7G /mnt/three/docker-registry/registry-data
One liner bash to list all images with their tags:
curl --user user:pass https://myregistry.com/v2/_catalog | jq .repositories | sed -n 's/[ ",]//gp' | xargs -L1 -IIMAGE curl -s --user user:pass https://myregistry.com/v2/IMAGE/tags/list | jq '. as $parent | .tags[] | $parent.name + ":" + . '
Two lines to search for something in the image name:
search=my_container_part_name
curl --user user:pass https://registry.medworx.io/v2/_catalog | jq .repositories | sed -n '/'"$search"'/{s/[ ",]//gp;}' | xargs -L1 -IIMAGE curl -s --user user:pass https://registry.medworx.io/v2/IMAGE/tags/list | jq '. as $parent | .tags[] | $parent.name + ":" + . '
Replace user, pass and myregistry.com accordingly.
This uses curl, sed, xargs and jq and is hard to understand, but it does the job. It makes one call per image, plus one for the catalog.
If you can ssh or attach to the docker registry container, just browse the filesystem to look for things you want, like:
kubectl exec -it docker-registry-0 -- /bin/sh
ls /var/lib/registry/docker/registry/v2/repositories
ls /var/lib/registry/docker/registry/v2/repositories/busybox/_manifests/tags/
Since each registry runs as a container, the container ID has an associated log file, ID-json.log. This log file contains vars.name=[image] and vars.reference=[tag] entries, and a script can be used to extract and print them. This is perhaps one method to list images pushed to registry V2-2.0.1.
If your use-case is identifying only SIGNED and TRUSTED images for production, then this method is handy.
It parses a docker image repo for all SIGNED tags and strips away all the JSON formatting, puking-out only clean image tags. Which of course can be processed further according to your requirements.
Format of Command:
docker trust inspect imageName | grep "SignedTag" | awk -F'"' '{print $4}'
Examples using the nginx & Bitnami Docker repos:
docker trust inspect nginx | grep "SignedTag" | awk -F'"' '{print $4}'
docker trust inspect bitnami/java | grep "SignedTag" | awk -F'"' '{print $4}'
If there are no signed images then No signatures or cannot access imageName will be returned.
Example of a repo WITHOUT signed images (at the time of this writing) using the Wordpress Docker repo:
docker trust inspect wordpress | grep "SignedTag" | awk -F'"' '{print $4}'
If you want a nice web interface to your registry you can use this registry-browser docker image. This is useful if you just want to look around your registry, different repositories and tags.
If the accepted answer here only returns a blank line, it is likely because of the SSL/TLS cert on your registry server. Use the --insecure flag:
curl --insecure https://<registryHostnameOrIP>:5000/v2/_catalog

How to delete images from a private docker registry?

I run a private docker registry, and I want to delete all images but the latest from a repository. I don't want to delete the entire repository, just some of the images inside it. The API docs don't mention a way to do this, but surely it's possible?
Currently you cannot use the Registry API for that task. It only allows you to delete a repository or a specific tag.
In general, deleting a repository means, that all the tags associated to this repo are deleted.
Deleting a tag means, that the association between an image and a tag is deleted.
None of the above will delete a single image. They are left on your disk.
Workaround
For this workaround you need to have your docker images stored locally.
A workaround for your problem would be to delete all but the latest tags, thereby potentially removing the references to the associated images. Then you can run this script to remove all images that are not referenced by any tag or by the ancestry of any used image.
Terminology (images and tags)
Consider an image graph like this where the capital letters (A, B, ...) represent short image IDs and <- means that an image is based on another image:
A <- B <- C <- D
Now we add tags to the picture:
A <- B <- C <- D
          |    |
          |    <version2>
          <version1>
Here, the tag <version1> references the image C and the tag <version2> references the image D.
Refining your question
In your question you said that you wanted to remove "all images but the latest". Now, this terminology is not quite correct. You've mixed images and tags. Looking at the graph I think you would agree that the tag <version2> represents the latest version. In fact, according to this question you can have a tag that represents the latest version:
A <- B <- C <- D
          |    |
          |    <version2>
          |    <latest>
          <version1>
Since the <latest> tag references image D I ask you: do you really want to delete all but image D? Probably not!
What happens if you delete a tag?
If you delete the tag <version1> using the Docker REST API you will get this:
A <- B <- C <- D
               |
               <version2>
               <latest>
Remember: Docker will never delete an image! Even if it did, in this case it cannot delete an image, since the image C is part of the ancestry for the image D which is tagged.
Even if you use this script, no image will be deleted.
When an image can be deleted
This requires that you can control when somebody can pull or push to your registry (e.g. by disabling the REST interface). Under that condition, you can delete an image from an image graph if no other image is based on it and no tag refers to it.
Notice that in the following graph, the image D is not based on C but on B. Therefore, D doesn't depend on C. If you delete tag <version1> in this graph, the image C will not be used by any image and this script can remove it.
A <- B <--------- D
      \           |
       \          <version2>
        \         <latest>
         \ <- C
              |
              <version1>
After the cleanup your image graph looks like this:
A <- B <- D
          |
          <version2>
          <latest>
Is this what you want?
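The deletability rule described above (no other image is based on it, and no tag refers to it) can be sketched as a small filter. This is an illustration only, not the cleanup script mentioned in this answer: the function name deletable_images and both input file formats are assumptions made for the example.

```shell
# Print the images that no other image is based on and that no tag refers to.
# $1: non-empty file of "image parent" pairs, one per line ("-" for no parent)
# $2: file of "tag image" pairs, one per line
deletable_images() {
    awk '
        # First file: remember every image and mark its parent as having a child.
        NR == FNR { images[$1] = 1; if ($2 != "-") has_child[$2] = 1; next }
        # Second file: mark every image that a tag points to.
        { tagged[$2] = 1 }
        END {
            for (img in images)
                if (!(img in has_child) && !(img in tagged))
                    print img
        }
    ' "$1" "$2" | sort
}

# Hypothetical usage, with the graph from above after deleting <version1>:
#   printf 'A -\nB A\nC B\nD B\n' > edges.txt
#   printf 'version2 D\nlatest D\n' > tags.txt
#   deletable_images edges.txt tags.txt
```

With the graph where D is based on B, only C is reported. With the linear graph A <- B <- C <- D nothing is reported, because C is still part of the ancestry of D.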
I've faced the same problem with my registry, then I tried the solution listed below from a blog page. It works.
Do note, the deletion must be enabled for it to work. You can do it by providing a custom config, or by setting REGISTRY_STORAGE_DELETE_ENABLED=true.
Step 1: List the repositories
$ curl -sS <domain-or-ip>:5000/v2/_catalog
The response will be in the following format:
{
"repositories": [
<repo>,
...
]
}
Step 2: List the repository tags
$ curl -sS <domain-or-ip>:5000/v2/<repo>/tags/list
The response will be in the following format:
{
"name": <repo>,
"tags": [
<tag>,
...
]
}
Step 3: Determine the digest of the target tag
$ curl -sS -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
-o /dev/null \
-w '%header{Docker-Content-Digest}' \
<domain-or-ip>:5000/v2/<repo>/manifests/<tag>
Do note, the Accept header is needed here. Without it you'll get a different value and the deletion will fail.
The response will be in the following format:
sha256:6de813fb93debd551ea6781e90b02f1f93efab9d882a6cd06bbd96a07188b073
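The %header{...} write-out variable is only available in newer curl releases. Where it is missing, the same digest can be read from the raw response headers. The following is a sketch under that assumption; the function names digest_from_headers and tag_digest are hypothetical, and the registry is assumed to need no authentication.

```shell
# Read HTTP response headers on stdin and print the value of the
# Docker-Content-Digest header (header names are matched case-insensitively).
digest_from_headers() {
    tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# Print the manifest digest of one tag.
# $1: registry base URL, $2: repository, $3: tag
tag_digest() {
    # -I sends a HEAD request, so only the headers are returned.
    curl -sS -I \
        -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        "$1/v2/$2/manifests/$3" | digest_from_headers
}

# Hypothetical usage:
#   tag_digest http://localhost:5000 ubuntu 14.04
```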
Step 4: Delete the manifest
$ curl -sS -X DELETE <domain-or-ip>:5000/v2/<repo>/manifests/<digest>
Step 5: Garbage collect the image
Run this command in your docker registry container:
$ registry garbage-collect /etc/docker/registry/config.yml
Here is my config.yml:
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
  delete:
    enabled: true
http:
  addr: :5000
  headers:
    X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
The current v2 registry now supports deleting via DELETE /v2/<name>/manifests/<reference>.
See: https://github.com/docker/distribution/blob/master/docs/spec/api.md#deleting-an-image
The <reference> can be taken from the Docker-Content-Digest header of a GET /v2/<name>/manifests/<tag> request (do note that the Accept: application/vnd.docker.distribution.manifest.v2+json header is needed for this request).
A script that makes use of it: https://github.com/byrnedo/docker-reg-tool
For it to work deletion must be enabled (REGISTRY_STORAGE_DELETE_ENABLED=true).
And to really free the disk space you need to run garbage collection.
Problem 1
You mentioned it was your private docker registry, so you probably need to check Registry API instead of Hub registry API doc, which is the link you provided.
Problem 2
docker registry API is a client/server protocol, it is up to the server's implementation on whether to remove the images in the back-end. (I guess)
DELETE /v1/repositories/(namespace)/(repository)/tags/(tag*)
Detailed explanation
Below I demonstrate how it currently works, based on my understanding of your question.
I run a private docker registry.
I use the default one, and listen on port 5000.
docker run -d -p 5000:5000 registry
Then I tag the local image and push into it.
$ docker tag ubuntu localhost:5000/ubuntu
$ docker push localhost:5000/ubuntu
The push refers to a repository [localhost:5000/ubuntu] (len: 1)
Sending image list
Pushing repository localhost:5000/ubuntu (1 tags)
511136ea3c5a: Image successfully pushed
d7ac5e4f1812: Image successfully pushed
2f4b4d6a4a06: Image successfully pushed
83ff768040a0: Image successfully pushed
6c37f792ddac: Image successfully pushed
e54ca5efa2e9: Image successfully pushed
Pushing tag for rev [e54ca5efa2e9] on {http://localhost:5000/v1/repositories/ubuntu/tags/latest}
After that I can use Registry API to check it exists in your private docker registry
$ curl -X GET localhost:5000/v1/repositories/ubuntu/tags
{"latest": "e54ca5efa2e962582a223ca9810f7f1b62ea9b5c3975d14a5da79d3bf6020f37"}
Now I can delete the tag using that API !!
$ curl -X DELETE localhost:5000/v1/repositories/ubuntu/tags/latest
true
Check again, the tag doesn't exist in my private registry server
$ curl -X GET localhost:5000/v1/repositories/ubuntu/tags/latest
{"error": "Tag not found"}
This is really ugly but it works, text is tested on registry:2.5.1.
I did not manage to get delete working smoothly even after updating configuration to enable delete. The ID was really difficult to retrieve, had to login to get it, maybe some misunderstanding. Anyway, the following works:
Enter the container
docker exec -it registry sh
Define variables matching your container and container version:
export NAME="google/cadvisor"
export VERSION="v0.24.1"
Move to the registry directory:
cd /var/lib/registry/docker/registry/v2
Delete files related to your hash:
find . | grep `ls ./repositories/$NAME/_manifests/tags/$VERSION/index/sha256`| xargs rm -rf $1
Delete manifests:
rm -rf ./repositories/$NAME/_manifests/tags/$VERSION
Logout
exit
Run the GC:
docker exec -it registry bin/registry garbage-collect /etc/docker/registry/config.yml
If all was done properly some information about deleted blobs is shown.
There are some clients (in Python, Ruby, etc) which do exactly that. For my taste, it isn't sustainable to install a runtime (e.g. Python) on my registry server, just to housekeep my registry!
So deckschrubber is my solution:
go install github.com/fraunhoferfokus/deckschrubber@latest
$GOPATH/bin/deckschrubber
Images older than a given age are automatically deleted. Age can be specified using -year, -month, -day, or a combination of them:
$GOPATH/bin/deckschrubber -month 2 -day 13 -registry http://registry:5000
UPDATE: here's a short introduction on deckschrubber.
The requirement to delete all tags except latest gets complicated because the same image manifest can be pointed to by multiple tags, so when you delete a manifest for one tag, you may effectively delete multiple tags.
There are a few options to make that workable. One is to track the digest for the latest tag and only delete manifests for other digests, or you can use some different API calls to delete the tags themselves.
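The first option (track the digest of latest and delete only manifests with other digests) boils down to a small filter over tag and digest pairs. The following is a sketch, assuming the pairs have already been collected, for example from the Docker-Content-Digest header of each tag's manifest. The function name digests_to_delete and the "tag digest" line format are assumptions for this example.

```shell
# Read "tag digest" pairs on stdin, one per line, and print each digest
# exactly once unless it is the digest that the "latest" tag points to.
# Fails with status 1 and prints nothing if there is no "latest" tag,
# so that a missing tag never results in deleting everything.
digests_to_delete() {
    awk '
        $1 == "latest" { keep = $2 }
        { digest[NR] = $2 }
        END {
            if (keep == "") exit 1
            for (i = 1; i <= NR; i++)
                if (digest[i] != keep && !seen[digest[i]]++)
                    print digest[i]
        }
    '
}

# Hypothetical usage:
#   printf 'v1 sha256:aaa\nv2 sha256:bbb\nlatest sha256:bbb\n' | digests_to_delete
```

In the usage example, v2 shares its manifest with latest, so its digest is kept and only the digest of v1 is printed.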
Regardless of how you implement this, first your registry needs to be configured to allow the delete API's. With the minimal registry:2 image, that involves starting it with an environment variable REGISTRY_STORAGE_DELETE_ENABLED=true (or the equivalent yaml config).
Then for a simple script to loop through the tags and delete, there's:
#!/bin/sh
repo="localhost:5000/repo/to/prune"
for tag in $(regctl tag ls $repo); do
  if [ "$tag" != "latest" ]; then
    echo "Deleting: $(regctl image digest --list "${repo}:${tag}") [$tag]"
    regctl tag rm "${repo}:${tag}"
  fi
done
The regctl command used here comes from regclient and the regctl tag rm logic first attempts to perform the tag delete API added recently to the distribution-spec. Since most registries haven't implemented that spec, it falls back to the manifest delete API, but it first creates a dummy manifest to overwrite the tag, and then deletes that new digest. In doing so, if the old manifest was in use by other tags, it doesn't delete those other tags.
An alternative version of the script that deletes manifests except those pointing to the latest digest looks like:
#!/bin/sh
repo="localhost:5000/repo/to/prune"
save="$(regctl image digest --list "${repo}:latest")"
for tag in $(regctl tag ls $repo); do
  digest="$(regctl image digest --list "${repo}:${tag}")"
  if [ "$digest" != "$save" ]; then
    echo "Deleting: $digest [$tag]"
    regctl manifest rm "${repo}@${digest}"
  fi
done
If you find yourself needing to create a deletion policy to automate the deleting of lots of images, I'd recommend looking at regclient/regbot from the same repo which allows you to define that policy and leave it running to continuously prune your registry.
Once the images have been deleted, you'll need to garbage collect your registry in most use cases. For example with the registry:2 image that looks like:
docker exec registry /bin/registry garbage-collect \
/etc/docker/registry/config.yml --delete-untagged
Briefly:
1) Run the following command to get the RepoDigests of a docker repo:
## docker inspect <registry-host>:<registry-port>/<image-name>:<tag>
> docker inspect 174.24.100.50:8448/example-image:latest
[
  {
    "Id": "sha256:16c5af74ed970b1671fe095e063e255e0160900a0e12e1f8a93d75afe2fb860c",
    "RepoTags": [
      "174.24.100.50:8448/example-image:latest",
      "example-image:latest"
    ],
    "RepoDigests": [
      "174.24.100.50:8448/example-image@sha256:5580b2110c65a1f2567eeacae18a3aec0a31d88d2504aa257a2fecf4f47695e6"
    ],
    ...
    ...
${digest} =
sha256:5580b2110c65a1f2567eeacae18a3aec0a31d88d2504aa257a2fecf4f47695e6
2) Use registry REST API
##curl -u username:password -vk -X DELETE <registry-host>:<registry-port>/v2/<image-name>/manifests/${digest}
>curl -u example-user:example-password -vk -X DELETE http://174.24.100.50:8448/v2/example-image/manifests/sha256:5580b2110c65a1f2567eeacae18a3aec0a31d88d2504aa257a2fecf4f47695e6
You should get a 202 Accepted for a successful invocation.
3) Run the garbage collector
docker exec registry bin/registry garbage-collect --dry-run /etc/docker/registry/config.yml
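Here registry is the registry container name. With --dry-run the command only reports what would be removed; drop the flag to actually delete the blobs.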
Another tool you can use is registry-cli. For example, this command:
registry.py -l "login:password" -r https://your-registry.example.com --delete
will delete all but the last 10 images.
There is also a way you can remove some old images from repository just based on the date when it was created.
To do that enter your docker registry container and get the list of manifest's revisions for some specific repository:
ls -latr /var/lib/registry/docker/registry/v2/repositories/YOUR_REPO/_manifests/revisions/sha256/
The output then may be used within the request (with sha256 prefix):
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE http://DOCKER_REGISTRY_HOST:5000/v2/YOUR_REPO/manifests/sha256:OUTPUT_LINE
And of course do not forget to execute 'garbage-collect' command after that:
bin/registry garbage-collect /etc/docker/registry/config.yml
I am usually all for doing things with scripts, but if you are already running a registry UI container built from Joxit/docker-registry-ui, I found it easier to just opt-click the delete button in the UI and delete a page of images at a time, then garbage collect after.
This docker image includes a bash script that can be used to remove images from a remote v2 registry :
https://hub.docker.com/r/vidarl/remove_image_from_registry/
The Bash script below deletes all the tags located in the registry except the latest.
for D in /registry-data/docker/registry/v2/repositories/*; do
  if [ -d "${D}" ]; then
    if [ -z "$(ls -A ${D}/_manifests/tags/)" ]; then
      echo ''
    else
      # Skip the most recently modified tag and delete the rest.
      for R in $(ls -t ${D}/_manifests/tags/ | tail -n +2); do
        digest=$(curl -k -I -s -H 'accept: application/vnd.docker.distribution.manifest.v2+json' "http://xx.xx.xx.xx:5000/v2/$(basename ${D})/manifests/${R}" | grep Docker-Content-Digest | awk '{print $2}' )
        url="http://xx.xx.xx.xx:5000/v2/$(basename ${D})/manifests/$digest"
        url=${url%$'\r'}
        curl -X DELETE -k -I -s $url -H 'accept: application/vnd.docker.distribution.manifest.v2+json'
      done
    fi
  fi
done
After this Run
docker exec $(docker ps | grep registry | awk '{print $1}') /bin/registry garbage-collect /etc/docker/registry/config.yml
Simple ruby script based on this answer: registry_cleaner.
You can run it on local machine:
./registry_cleaner.rb --host=https://registry.exmpl.com --repository=name --tags_count=4
And then on the registry machine remove blobs with /bin/registry garbage-collect /etc/docker/registry/config.yml.
Here is a script based on Yavuz Sert's answer.
It skips the latest tag and any tag numbered 950 or higher, and deletes every other tag.
#!/usr/bin/env bash
CheckTag(){
    Name=$1
    Tag=$2
    Skip=0
    if [[ "${Tag}" == "latest" ]]; then
        Skip=1
    fi
    if [[ "${Tag}" -ge "950" ]]; then
        Skip=1
    fi
    if [[ "${Skip}" == "1" ]]; then
        echo "skip ${Name} ${Tag}"
    else
        echo "delete ${Name} ${Tag}"
        # Read the digest from the verbose output ("< Docker-Content-Digest: sha256:...").
        Sha=$(curl -v -s -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET http://127.0.0.1:5000/v2/${Name}/manifests/${Tag} 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}')
        Sha="${Sha/$'\r'/}"
        curl -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE "http://127.0.0.1:5000/v2/${Name}/manifests/${Sha}"
    fi
}
ScanRepository(){
    Name=$1
    echo "Repository ${Name}"
    curl -s http://127.0.0.1:5000/v2/${Name}/tags/list | jq '.tags[]' |
    while IFS= read -r line; do
        # Strip the surrounding double quotes that jq prints.
        line="${line%\"}"
        line="${line#\"}"
        CheckTag $Name $line
    done
}
JqPath=$(which jq)
if [[ "x${JqPath}" == "x" ]]; then
    echo "Couldn't find jq executable."
    exit 2
fi
curl -s http://127.0.0.1:5000/v2/_catalog | jq '.repositories[]' |
while IFS= read -r line; do
    line="${line%\"}"
    line="${line#\"}"
    ScanRepository $line
done
A script to remove all but the latest tag from an insecure registry (private, no auth):
#!/bin/sh -eu
repo=$1
registry=${2-localhost:5000}
tags=`curl -sS "$registry/v2/$repo/tags/list" | jq -r .tags[]`
tag2digest() {
    local tag=$1
    curl -sS -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        -o /dev/null \
        -w '%header{Docker-Content-Digest}' \
        "$registry/v2/$repo/manifests/$tag"
}
latest_digest=`tag2digest latest`
digests=`echo "$tags" \
    | while IFS= read -r tag; do
        tag2digest "$tag"
        echo
    done \
    | sort \
    | uniq`
digests=`echo "$digests" \
    | grep -Fvx "$latest_digest"`
echo "$digests" \
    | while IFS= read -r digest; do
        curl -sS -X DELETE "$registry/v2/$repo/manifests/$digest"
    done
Usage:
$ ./rm-tags.sh <image> [<registry>]
After removing tags (or manifests to be more precise) run garbage collection:
$ registry garbage-collect /etc/docker/registry/config.yml
To support Docker Hub and/or auth see these answers.
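The script works on digests rather than tags because several tags can point at the same manifest, so the digest behind latest has to be excluded before anything is deleted. That filtering step can be sketched in isolation as follows; digests_to_delete is a hypothetical name, and the digests are invented placeholders.

```shell
#!/bin/sh
# Sketch only: the digests below are invented placeholders.

# digests_to_delete LATEST: read one digest per line on stdin and print the
# unique digests that differ from LATEST.
digests_to_delete() {
    # "|| true" keeps the function from failing when nothing is left to
    # delete, since grep exits non-zero when it selects no lines.
    sort -u | grep -Fvx -- "$1" || true
}

printf '%s\n' sha256:aaa sha256:bbb sha256:aaa sha256:ccc |
    digests_to_delete sha256:aaa
```

With -F the digest is matched as a fixed string and with -x only whole-line matches are dropped, so a digest that merely contains the latest digest as a substring would survive.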

How can I find a Docker image with a specific tag in Docker registry on the Docker command line?

I'm trying to locate one specific tag for a Docker image. How can I do it on the command line? I want to avoid downloading all the images and then removing the unneeded ones.
In the official Ubuntu repository, https://registry.hub.docker.com/_/ubuntu/, there are several tags (one per release), but when I search for it on the command line, I only see this:
user@ubuntu:~$ docker search ubuntu | grep ^ubuntu
ubuntu Official Ubuntu base image 354
ubuntu-upstart Upstart is an event-based replacement for ... 7
ubuntufan/ping 0
ubuntu-debootstrap 0
The help for the search command, https://docs.docker.com/engine/reference/commandline/search/, gives no clue how this could work either.
Is it possible in the docker search command?
If I use a raw command to search via the Docker registry API, then the information can be fetched:
$ curl https://registry.hub.docker.com//v1/repositories/ubuntu/tags | python -mjson.tool
[
{
"layer": "ef83896b",
"name": "latest"
},
.....
{
"layer": "463ff6be",
"name": "raring"
},
{
"layer": "195eb90b",
"name": "saucy"
},
{
"layer": "ef83896b",
"name": "trusty"
}
]
When using CoreOS, jq is available to parse JSON data.
So like you were doing before, looking at library/centos:
$ curl -s -S 'https://registry.hub.docker.com/v2/repositories/library/centos/tags/' | jq '."results"[]["name"]' |sort
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
The cleaner v2 API is available now, and that's what I'm using in the example. Here is a simple script, docker_remote_tags:
#!/usr/bin/bash
curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$1/tags/" | jq '."results"[]["name"]' |sort
Enables:
$ ./docker_remote_tags centos
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
Reference:
jq: https://stedolan.github.io/jq/ | apt-get install jq
I didn't like any of the solutions above because (A) they required external libraries that I didn't have and didn't want to install, and (B) I didn't get all the pages.
The Docker Hub API limits you to 100 items per request. This loop follows each "next" link and collects every page (for Python it's seven pages; other images may have more or fewer).
If you really want to spam yourself, remove | cut -d '-' -f 1 from the last line, and you will see absolutely everything.
url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `# Initial url` ; \
( \
while [ ! -z $url ]; do `# Keep looping until the variable url is empty` \
>&2 echo -n "." `# Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the output to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be stored in a variable called content` ; \
url=$(echo "$content" | head -n 1) `# Let's get the first line of content which contains the next URL for the loop to continue` ; \
echo "$content" | tail -n +2 `# Print the content without the first line (yes +2 is counter intuitive)` ; \
done; \
>&2 echo `# Finally break the line of dots` ; \
) | cut -d '-' -f 1 | sort --version-sort | uniq;
Sample output:
$ url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `#initial url` ; \
> ( \
> while [ ! -z $url ]; do `#Keep looping until the variable url is empty` \
> >&2 echo -n "." `#Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
> content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the JSON to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be store in a variable called content` ; \
> url=$(echo "$content" | head -n 1) `#Let's get the first line of content which contains the next URL for the loop to continue` ; \
> echo "$content" | tail -n +2 `#Print the content with out the first line (yes +2 is counter intuitive)` ; \
> done; \
> >&2 echo `#Finally break the line of dots` ; \
> ) | cut -d '-' -f 1 | sort --version-sort | uniq;
...
2
2.6
2.6.17
2.8
2.8.6
2.8.7
2.8.8
2.8.9
2.8.10
2.8.11
2.8.12
2.8.13
2.8.14
2.8.15
2.8.16
2.8.17
2.8.18
2.8.19
2.8.20
2.8.21
2.8.22
2.8.23
3
3.0
3.0.0
3.0.1
3.0.2
3.0.3
3.0.4
3.0.5
3.0.6
3.0.7
3.0.504
3.2
3.2.0
3.2.1
3.2.2
3.2.3
3.2.4
3.2.5
3.2.6
3.2.7
3.2.8
3.2.9
3.2.10
3.2.11
3.2.100
4
4.0
4.0.0
4.0.1
4.0.2
4.0.4
4.0.5
4.0.6
4.0.7
4.0.8
32bit
alpine
latest
nanoserver
windowsservercore
If you want the bash_profile version:
function docker-tags () {
    name=$1
    # Initial URL
    url=https://registry.hub.docker.com/v2/repositories/library/$name/tags/?page_size=100
    (
        # Keep looping until the variable url is empty
        while [ -n "$url" ]; do
            # Every iteration prints a single dot to stderr to show progress through the pages
            >&2 echo -n "."
            # Curl the URL and pipe the output to Python. Python parses the JSON and prints the next URL
            # as the very first line (blank if there are no more pages), followed by the name of each result.
            # Everything is stored in a variable called content.
            content=$(curl -s "$url" | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))')
            # The first line of content holds the next URL for the loop to continue
            url=$(echo "$content" | head -n 1)
            # Print the content without the first line (yes, +2 is counterintuitive)
            echo "$content" | tail -n +2
        done
        # Finally, break the line of dots
        >&2 echo
    ) | cut -d '-' -f 1 | sort --version-sort | uniq
}
And simply call it: docker-tags redis
Sample output:
$ docker-tags redis
...
2
2.6
2.6.17
2.8
--trunc----
32bit
alpine
latest
nanoserver
windowsservercore
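The last stage of that pipeline is what turns several hundred raw tags into the short list shown above: cut keeps only the part before the first dash, so variants such as 3.2.9-alpine collapse into 3.2.9, and the version sort orders 3.2.9 before 3.2.10. Here is a small sketch of that stage alone, run on a handful of invented tag names; collapse_tags is a hypothetical name, and sort --version-sort is assumed to be the GNU coreutils option used in the answer.

```shell
#!/bin/sh
# collapse_tags: read tag names on stdin, drop everything after the first
# dash, then sort by version number and remove duplicates.
collapse_tags() {
    cut -d '-' -f 1 | sort --version-sort | uniq
}

# Invented sample tags, in the style of the redis repository.
printf '%s\n' 3.2.10 3.2.9-alpine 3.2.9 alpine3.8 latest 3.2.10-32bit |
    collapse_tags
```

A plain lexical sort would place 3.2.10 before 3.2.9, which is why the answer uses the version sort.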
As far as I know, the CLI does not allow searching/listing tags in a repository.
But if you know which tag you want, you can pull it explicitly by adding a colon and the tag name to the image name: docker pull ubuntu:saucy
This script (docker-show-repo-tags.sh) should work for any Docker enabled host that has curl, sed, grep, and sort. This was updated to reflect the fact the repository tag URLs changed.
This version correctly parses the "name": field without a JSON parser.
#!/bin/sh
# 2022-07-20
# Simple script that will display Docker repository tags
# using basic tools: curl, awk, sed, grep, and sort.
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
# $ docker-show-repo-tags.sh centos | cat -n
for Repo in "$@" ; do
URL="https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/"
curl -sS "$URL" | \
/usr/bin/sed -Ee 's/("name":)"([^"]*)"/\n\1\2\n/g' | \
grep '"name":' | \
awk -F: '{printf("'$Repo':%s\n",$2)}'
done
This older version no longer works. Many thanks to @d9k for pointing this out!
#!/bin/sh
# WARNING: This no longer works!
# Simple script that will display Docker repository tags
# using basic tools: curl, sed, grep, and sort.
#
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
curl -sS "https://hub.docker.com/r/library/$Repo/tags/" | \
sed -e $'s/"tags":/\\\n"tags":/g' -e $'s/\]/\\\n\]/g' | \
grep '^"tags"' | \
grep '"library"' | \
sed -e $'s/,/,\\\n/g' -e 's/,//g' -e 's/"//g' | \
grep -v 'library:' | \
sort -fu | \
sed -e "s/^/${Repo}:/"
done
This older version no longer works. Many thanks to @viky for pointing this out!
#!/bin/sh
# WARNING: This no longer works!
# Simple script that will display Docker repository tags.
#
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/" | \
sed -e $'s/,/,\\\n/g' -e $'s/\[/\\\[\n/g' | \
grep '"name"' | \
awk -F\" '{print $4;}' | \
sort -fu | \
sed -e "s/^/${Repo}:/"
done
This is the output for a simple example:
$ docker-show-repo-tags.sh centos | cat -n
1 centos:5
2 centos:5.11
3 centos:6
4 centos:6.10
5 centos:6.6
6 centos:6.7
7 centos:6.8
8 centos:6.9
9 centos:7.0.1406
10 centos:7.1.1503
11 centos:7.2.1511
12 centos:7.3.1611
13 centos:7.4.1708
14 centos:7.5.1804
15 centos:centos5
16 centos:centos5.11
17 centos:centos6
18 centos:centos6.10
19 centos:centos6.6
20 centos:centos6.7
21 centos:centos6.8
22 centos:centos6.9
23 centos:centos7
24 centos:centos7.0.1406
25 centos:centos7.1.1503
26 centos:centos7.2.1511
27 centos:centos7.3.1611
28 centos:centos7.4.1708
29 centos:centos7.5.1804
30 centos:latest
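The parser-free extraction used by the current version of the script can be checked offline against a canned response. The sketch below uses grep -o instead of the sed newline trick, which is a swapped-in variant and not the script's own method. extract_tag_names is a hypothetical name and the JSON is an invented, heavily trimmed sample. Note that it matches every "name" key it finds, not only the ones directly under "results".

```shell
#!/bin/sh
# extract_tag_names: read JSON on stdin and print the value of every
# "name" key, one per line, without a JSON parser.
extract_tag_names() {
    grep -o '"name": *"[^"]*"' | sed 's/^"name": *"\(.*\)"$/\1/'
}

# Invented, trimmed sample of a tags response.
sample='{"count":2,"next":null,"results":[{"name":"latest","id":1},{"name":"7.5.1804","id":2}]}'
printf '%s\n' "$sample" | extract_tag_names
```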
I wrote a command line tool to simplify searching Docker Hub repository tags, available in my PyTools GitHub repository. It's simple to use with various command line switches, but most basically:
./dockerhub_show_tags.py repo1 repo2
It's even available as a Docker image and can take multiple repositories:
docker run harisekhon/pytools dockerhub_show_tags.py centos ubuntu
DockerHub
repo: centos
tags: 5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
repo: ubuntu
tags: latest
14.04
15.10
16.04
trusty
trusty-20160503.1
wily
wily-20160503
xenial
xenial-20160503
If you want to embed it in scripts, use -q / --quiet to get just the tags, like normal Docker commands:
./dockerhub_show_tags.py centos -q
5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
The v2 API seems to use some kind of pagination, so that it does not return all the available tags. This is clearly visible in projects such as python (or library/python). Even after quickly reading the documentation, I could not manage to work with the API correctly (maybe it is the wrong documentation).
Then I rewrote the script using the v1 API, and it is still using jq:
#!/bin/bash
repo="$1"
if [[ "${repo}" != */* ]]; then
repo="library/${repo}"
fi
url="https://registry.hub.docker.com/v1/repositories/${repo}/tags"
curl -s -S "${url}" | jq '.[]["name"]' | sed 's/^"\(.*\)"$/\1/' | sort
The full script is available at: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.sh
I've also written an improved version (in Python) that aggregates tags that point to the same version: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.py
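The prefix check at the top of the script exists because official images live under the library/ namespace, while user and organization images already carry their own namespace. The same rule can be written in plain POSIX sh; normalize_repo below is a hypothetical helper, not part of the linked scripts.

```shell
#!/bin/sh
# normalize_repo: prefix bare image names with "library/", leave
# names that already contain a namespace untouched.
normalize_repo() {
    case $1 in
        */*) printf '%s\n' "$1" ;;
        *)   printf 'library/%s\n' "$1" ;;
    esac
}

normalize_repo openjdk
normalize_repo google/cloud-sdk
```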
Add this function to your .zshrc file or run the command manually:
#usage list-dh-tags <repo>
#example: list-dh-tags node
function list-dh-tags(){
wget -q https://registry.hub.docker.com/v1/repositories/$1/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'
}
Thanks to this -> How can I list all tags for a Docker image on a remote registry?
For anyone stumbling across this in modern times, you can use Skopeo to retrieve an image's tags from the Docker registry:
$ skopeo list-tags docker://jenkins/jenkins \
| jq -r '.Tags[] | select(. | contains("lts-alpine"))' \
| sort --version-sort --reverse
lts-alpine
2.277.3-lts-alpine
2.277.2-lts-alpine
2.277.1-lts-alpine
2.263.4-lts-alpine
2.263.3-lts-alpine
2.263.2-lts-alpine
2.263.1-lts-alpine
2.249.3-lts-alpine
2.249.2-lts-alpine
2.249.1-lts-alpine
2.235.5-lts-alpine
2.235.4-lts-alpine
2.235.3-lts-alpine
2.235.2-lts-alpine
2.235.1-lts-alpine
2.222.4-lts-alpine
Reimplementation of the previous post, using Python over sed/AWK:
for Repo in "$@" ; do
    tags=$(curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/")
    python - <<EOF
import json
# Raw string, so backslash escapes inside the JSON text are left for json.loads
tags = [t['name'] for t in json.loads(r'''$tags''')['results']]
tags.sort()
for tag in tags:
    print("{}:{}".format('$Repo', tag))
EOF
done
For a script that works with OAuth bearer tokens on Docker Hub, try this:
Listing the tags of a Docker image on a Docker hub through the HTTP API
You can use Visual Studio Code to provide autocomplete for available Docker images and tags. However, this requires that you type the first letter of a tag in order to see autocomplete suggestions.
For example, when writing FROM ubuntu it offers autocomplete suggestions like ubuntu, ubuntu-debootstrap and ubuntu-upstart. When writing FROM ubuntu:a it offers autocomplete suggestions, like ubuntu:artful and ubuntu:artful-20170511.1

Spider a Website and Return URLs Only

I'm looking for a way to pseudo-spider a website. The key is that I don't actually want the content, but rather a simple list of URIs. I can get reasonably close to this idea with Wget using the --spider option, but when piping that output through a grep, I can't seem to find the right magic to make it work:
wget --spider --force-html -r -l1 http://somesite.com | grep 'Saving to:'
The grep filter seems to have absolutely no effect on the wget output. Have I got something wrong or is there another tool I should try that's more geared towards providing this kind of limited result set?
UPDATE
So I just found out offline that, by default, wget writes to stderr. I missed that in the man pages (in fact, I still haven't found it if it's in there). Once I piped the return to stdout, I got closer to what I need:
wget --spider --force-html -r -l1 http://somesite.com 2>&1 | grep 'Saving to:'
I'd still be interested in other/better means for doing this kind of thing, if any exist.
The absolute last thing I want to do is download and parse all of the content myself (i.e. create my own spider). Once I learned that Wget writes to stderr by default, I was able to redirect it to stdout and filter the output appropriately.
wget --spider --force-html -r -l2 $url 2>&1 \
| grep '^--' | awk '{ print $3 }' \
| grep -v '\.\(css\|js\|png\|gif\|jpg\)$' \
> urls.m3u
This gives me a list of the content resource (resources that aren't images, CSS or JS source files) URIs that are spidered. From there, I can send the URIs off to a third party tool for processing to meet my needs.
The output still needs to be streamlined slightly (it produces duplicates as it's shown above), but it's almost there and I haven't had to do any parsing myself.
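One way to handle the remaining duplicates is to finish the pipeline with sort -u. The sketch below wraps the same filter in a function and feeds it invented log lines shaped like the ones wget prints per request, so it can be tried without crawling anything. urls_from_wget_log is a hypothetical name, and grep -E is used here in place of the escaped alternation.

```shell
#!/bin/sh
# urls_from_wget_log: read wget's log on stdin and print each page URL once.
urls_from_wget_log() {
    grep '^--' |
        awk '{ print $3 }' |
        grep -Ev '\.(css|js|png|gif|jpg)$' |
        sort -u
}

# Invented log lines in the shape wget prints for each request.
cat <<'EOF' | urls_from_wget_log
--2020-01-01 10:00:00--  http://example.com/
Spider mode enabled. Check if remote file exists.
--2020-01-01 10:00:01--  http://example.com/style.css
--2020-01-01 10:00:02--  http://example.com/about.html
--2020-01-01 10:00:03--  http://example.com/
EOF
```

Against a live site, the function would be used as wget --spider --force-html -r -l2 $url 2>&1 | urls_from_wget_log.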
Create a few regular expressions to extract the addresses from all
<a href="(ADDRESS_IS_HERE)">.
Here is the solution I would use:
wget -q http://example.com -O - | \
tr "\t\r\n'" '   "' | \
grep -i -o '<a[^>]\+href[ ]*=[ \t]*"\(ht\|f\)tps\?:[^"]\+"' | \
sed -e 's/^.*"\([^"]\+\)".*$/\1/g'
This will output all http, https, ftp, and ftps links from a webpage. It will not give you relative urls, only full urls.
Explanation regarding the options used in the series of piped commands:
wget -q makes it not have excessive output (quiet mode).
wget -O - makes it so that the downloaded file is echoed to stdout, rather than saved to disk.
tr is the unix character translator, used in this example to translate newlines and tabs to spaces, as well as convert single quotes into double quotes so we can simplify our regular expressions.
grep -i makes the search case-insensitive
grep -o makes it output only the matching portions.
sed is the Stream EDitor unix utility which allows for filtering and transformation operations.
sed -e just lets you feed it an expression.
Running this little script on "http://craigslist.org" yielded quite a long list of links:
http://blog.craigslist.org/
http://24hoursoncraigslist.com/subs/nowplaying.html
http://craigslistfoundation.org/
http://atlanta.craigslist.org/
http://austin.craigslist.org/
http://boston.craigslist.org/
http://chicago.craigslist.org/
http://cleveland.craigslist.org/
...
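The pipeline can also be tried on a local snippet instead of a live page, which makes it easy to see what it keeps and what it drops. This is a sketch under a few assumptions: extract_absolute_links is a hypothetical name, the HTML is invented, and the pattern is rewritten for grep -E so it does not depend on the escaped \+, \? and \| forms.

```shell
#!/bin/sh
# extract_absolute_links: read HTML on stdin and print absolute
# http, https, ftp and ftps link targets, one per line.
extract_absolute_links() {
    # Tabs, CRs and newlines become spaces; single quotes become double quotes.
    tr "\t\r\n'" '   "' |
        grep -E -i -o '<a[^>]+href *= *"(ht|f)tps?:[^"]+"' |
        sed -e 's/^.*"\([^"]*\)"$/\1/'
}

# Invented sample page with an absolute, a single-quoted and a relative link.
cat <<'EOF' | extract_absolute_links
<p><a href="https://example.com/a">A</a>
<A HREF='ftp://example.org/f'>F</A>
<a href="/relative">R</a></p>
EOF
```

The relative link is dropped, as the answer notes, because the pattern requires a scheme after the opening quote.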
I've used a tool called xidel
xidel http://server -e '//a/@href' |
grep -v "http" |
sort -u |
xargs -L1 -I {} xidel http://server/{} -e '//a/@href' |
grep -v "http" | sort -u
A little hackish but gets you closer! This is only the first level. Imagine packing this up into a self recursive script!

Resources