How to automatically delete old images created during the build process? - docker

I have CI/CD automation for my project with GitLab. After I push code to the master branch, the GitLab runner builds a new Docker image on my server, tags it with the latest commit hash, and recreates the container from the new image. After a while there are a lot of unused images, and I want to delete them automatically.
Currently I delete old images manually.
This is my Makefile:
NAME := farvisun/javabina
TAG := $$(git log -1 --pretty=%h)
IMG := ${NAME}:${TAG}
LATEST := ${NAME}:latest
app_build:
	docker build -t ${IMG} -f javabina.dockerfile . && \
	docker tag ${IMG} ${LATEST}
app_up:
	docker-compose -p farvisun-javabina up -d javabina
After all of this, I want a simple bash script, or some other tool, to delete unused images: for example, keep the 3 latest images, or keep builds from the last 2 days, and delete the rest.

If you are fine with keeping a single image, you can use docker image prune -a -f, which removes every image that is not associated with a container (without -a, only dangling images are removed). If you run this command while the container is running, it removes all the other images.
Don't forget to run docker system prune every so often as well to further reduce storage usage.
In your situation where you need to keep more than one image, you can try this:
#!/bin/bash
for image_id in $(docker image ls | sed 1,4d | awk '{print $3}')
do
    docker image rm -f "$image_id"
done
The first line will list all docker images, remove from the list the first 3 ones which you want to keep, and select only the column with the image ID. Then, for each of the IDs, we remove the image.
If you want to remove more, change the 4d for another number. Notice that the first line is a header, so it must always be removed.
If you want to filter the images by a tag first, you can make your own filters.
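As a rough illustration of such a custom filter, here is a minimal Python sketch of the selection logic only. It assumes you have already collected (reference, image ID, creation time) tuples for one repository; the function name, the tuple layout, and the sample data are all hypothetical. It keeps the newest N distinct image IDs (so latest and the commit tag that share an ID count once) and, optionally, anything newer than a maximum age.

```python
from datetime import datetime, timedelta, timezone


def refs_to_remove(images, keep=3, max_age=None, now=None):
    """Return the repo:tag references that fall outside the retention rule.

    images: iterable of (reference, image_id, created) tuples, where created
    is a timezone-aware datetime. The newest `keep` distinct image IDs are
    always kept; if max_age is given, images younger than that are kept too.
    """
    now = now or datetime.now(timezone.utc)
    newest_first = sorted(images, key=lambda item: item[2], reverse=True)

    kept_ids = []
    for _, image_id, created in newest_first:
        recent = max_age is not None and now - created <= max_age
        if image_id not in kept_ids and (len(kept_ids) < keep or recent):
            kept_ids.append(image_id)

    return [ref for ref, image_id, _ in newest_first if image_id not in kept_ids]


# Hypothetical sample data: five builds, the newest one also tagged "latest".
NOW = datetime(2024, 1, 10, tzinfo=timezone.utc)
SAMPLE = [
    ("app:a1", "id1", NOW),
    ("app:latest", "id1", NOW),
    ("app:b2", "id2", NOW - timedelta(days=1)),
    ("app:c3", "id3", NOW - timedelta(days=2)),
    ("app:d4", "id4", NOW - timedelta(days=3)),
    ("app:e5", "id5", NOW - timedelta(days=4)),
]
```

Each returned reference could then be passed to docker image rm. Removing by reference rather than by ID avoids needing -f when several tags point at the same image.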

You can schedule a "docker image prune" command on the build machine (e.g. once a day or once a week).
https://docs.docker.com/engine/reference/commandline/image_prune/

Here is a way we can remove old images, after a new successful build in our pipeline
# build
- docker build -t APP_NAME:$(git describe --tags --no-abbrev) .
# docker tag
- docker tag APP_NAME:$(git describe --tags --no-abbrev) artifactory.XYZ.com/APP_NAME:latest
# remove old images
- sha=$(docker image inspect artifactory.XYZ.com/APP_NAME:latest -f '{{.ID}}')
- image_sha=$(echo $sha | cut -d':' -f2)
- image_id=$(echo $image_sha | head -c 12)
- docker image ls | grep APP_NAME | while read name tag id others; do if ! [ $id = $image_id ]; then docker image rm --force $id; fi ; done
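The ID handling in the last four pipeline steps can be sketched in Python as follows. This is only an illustration of the string manipulation (strip the sha256: prefix, keep the first 12 hex characters, compare against the listed IDs); the function names are made up for the example.

```python
def short_image_id(full_id, length=12):
    """Turn 'sha256:<hex digest>' into the short ID shown by `docker image ls`."""
    _, _, hex_part = full_id.rpartition(":")
    return hex_part[:length]


def stale_ids(listed_ids, current_full_id):
    """Return the listed short IDs that differ from the current image's short ID."""
    current = short_image_id(current_full_id)
    return [image_id for image_id in listed_ids if image_id != current]
```

For example, stale_ids(["8829ce7278c1", "a16fc65f4d19"], "sha256:8829ce72...") would leave only the second ID as a removal candidate.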

Related

Delete docker images from nexus registry using api

There's a Nexus setup running as a Docker registry. I'm struggling to delete old/unnecessary images from it using the APIs. So far I'm aware of the APIs below. There are 2 requirements:
Delete images older than 30 days.
Keep at least 5 tags of each image.
The delete API can only delete by image digest, but I'm not sure how to find the exact digest for a given image tag. The search API doesn't seem to work for Docker images. Can someone please help?
## Search api
https://help.sonatype.com/repomanager3/integrations/rest-and-integration-api/search-api?_ga=2.253346826.2007475959.1640178248-1042170715.1640178248#SearchAPI-SearchComponents
## Find all catalog images under docker registry
curl -u admin:adminPass -X "GET" nexus.example.com/v2/_catalog | jq
## Get all tags of an image
curl -u admin:adminPass -X "GET" nexus.example.com/v2/abc-web-service-prod/tags/list
## Get manifests
curl -u admin:adminPass -X "GET" "nexus.example.com/v2/abc-web-service-stage-2/manifests/5.2.6_1" | jq
## Delete by digest
curl -i -u admin:adminPass -X "DELETE" "nexus.example.com/v2/abc-web-service/manifests/sha256:8829ce7278c1151f61438dcfea20e3694fee2241a75737e3a8de31a27f0014a5"
Two things are missing from the "get manifests" example. First, if you include the http headers, you'll likely get the digest field, or you can skip the jq and pipe the result into a sha256sum to get the digest. But you also need to add an "Accept" header for the various media types of a manifest, otherwise the registry will convert it to an older schema v1 syntax which will not have the same digest. Here's an example that does the two v2 docker media types:
api="application/vnd.docker.distribution.manifest.v2+json"
apil="application/vnd.docker.distribution.manifest.list.v2+json"
curl -H "Accept: ${api}" -H "Accept: ${apil}" \
-u admin:adminPass \
-I -s "nexus.example.com/v2/abc-web-service-stage-2/manifests/5.2.6_1"
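To make the digest point concrete, here is a small Python sketch, assuming you already have the raw manifest bytes exactly as the registry returned them. The helper names are hypothetical; the media types are the two from the curl example. It also shows why piping through jq first gives the wrong digest: any reformatting changes the bytes being hashed.

```python
import hashlib

# Media types taken from the curl example above.
MANIFEST_TYPES = (
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
)


def accept_header(media_types=MANIFEST_TYPES):
    """Build a single Accept header value listing every wanted media type."""
    return ", ".join(media_types)


def manifest_digest(raw_manifest):
    """Digest of the exact response bytes (no parsing or re-serialising)."""
    return "sha256:" + hashlib.sha256(raw_manifest).hexdigest()
```

The result should match the Docker-Content-Digest response header when the registry sends one, and it is the value the delete-by-digest call expects.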
The next issue you'll run into with your policy is the 30 day requirement. You can get the creation time on many images by pulling their image config blob (it's listed in the manifest), but that date will be when the image was created, not when it was pushed or last pulled. There have been suggestions to add API's to OCI to handle more metadata, but we're still a ways off from that, and further still to get registry providers to implement them. So you'd end up deleting things that are likely being used. Even the 5 tag rule can be problematic if several new tags are created working through bugs in CI and you age out the image currently deployed in production.
With that all said, some tooling that I work on called regclient may help. The regctl command gives you a way to script this in a shell, e.g.:
#!/bin/sh
registry="nexus.example.com"
cutoff="$(date -d -30days '+%s')"
for repo in $(regctl repo ls "$registry"); do
  # The "head -n -5" ignores the last 5 tags, but you may want to sort that list first.
  for tag in $(regctl tag ls "$registry/$repo" | head -n -5); do
    # This is the most likely command to fail since the created timestamp is optional, may be set to 0,
    # and the string format might vary.
    # The cut is to remove the "+0000" that breaks the "date" command.
    created="$(regctl image config "$registry/$repo:$tag" --format '{{.Created}}' | cut -f1,2,4 -d' ')"
    createdSec="$(date -d "$created" '+%s')"
    # both timestamps are converted to seconds since epoch, allowing numeric comparison
    if [ "$createdSec" -lt "$cutoff" ]; then
      # next line is prefixed with echo for debugging, delete the echo to run the tag delete command
      echo regctl tag rm "$registry/$repo:$tag"
    fi
  done
done
Note that I'm using "regctl tag rm" above, which is different from the image manifest delete you're seeing in the API. It first attempts the OCI tag delete API, which your registry likely doesn't support, and then falls back to pushing a dummy manifest to the tag and deleting that. The alternative, deleting the manifest the tag currently points to, may delete more tags than intended (you could have 5 tags all pointing to the same manifest).
If you want to further automate this, regbot in that same repo lets you build a policy and run it on a schedule to constantly clean up old images according to your rules.
In addition to regclient, there's also crane and skopeo that may also help in this space, but the features of each of these will vary.
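Independent of which tool fetches the data, the retention rule from the question (always keep the newest 5 tags, and delete only tags older than 30 days) can be sketched as a pure Python function. This is a sketch under stated assumptions: the (tag, created) pairs, the function name, and the sample data are hypothetical, and created is the image build time from the config blob, with the caveats described above. Tags without a usable timestamp are never selected.

```python
from datetime import datetime, timedelta, timezone


def tags_to_delete(tags, now, min_keep=5, max_age=timedelta(days=30)):
    """Pick tags that are both outside the newest `min_keep` and older than `max_age`.

    tags: list of (tag, created) pairs; created is a timezone-aware datetime,
    or None when the image config carries no usable timestamp.
    """
    dated = [(tag, created) for tag, created in tags if created is not None]
    newest_first = sorted(dated, key=lambda pair: pair[1], reverse=True)
    candidates = newest_first[min_keep:]
    return [tag for tag, created in candidates if now - created > max_age]


# Hypothetical sample: ages in days for tags v1..v7, plus one undated tag.
NOW = datetime(2022, 1, 1, tzinfo=timezone.utc)
SAMPLE = [("v0", None)] + [
    (f"v{n}", NOW - timedelta(days=age))
    for n, age in enumerate([100, 90, 60, 40, 20, 10, 1], start=1)
]
```

With the sample data, only v2 and v1 are selected: v3 and v4 are older than 30 days but still among the 5 newest tags.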
I found a great solution to this.
https://github.com/andrey-pohilko/registry-cli
1. Create a docker image [name: registry-cli:1.0.1] using the Dockerfile below (its FROM line is not shown here; a Python base image is needed to run registry.py)
ADD requirements-build.txt /
RUN pip install -r /requirements-build.txt
ADD registry.py /
ENTRYPOINT ["/registry.py"]
2. Use below command to list down all images:tags in your private nexus registry.
docker run --rm registry-cli:1.0.1 -l admin:adminPass -r http://nexus.example.com
3. To get all tags of a particular image.
docker run --rm registry-cli:1.0.1 -l admin:adminPass -r http://nexus.example.com -i <name-of-the-image1> <name-of-the-image2>
4. To delete all old tags of a particular image but keep latest 10 tags.
docker run --rm registry-cli:1.0.1 -l admin:adminPass -r http://nexus.example.com -i <name-of-the-image1> --delete
5. To delete all the old tags of all the images in the repository but keep 10 latest tags of each image
docker run --rm registry-cli:1.0.1 -l admin:adminPass -r http://nexus.example.com --delete
6. If you wish to keep 20 images instead of 10 then use --num
docker run --rm registry-cli:1.0.1 -l admin:adminPass -r http://nexus.example.com --delete --num 20
7. Once you're done deleting the older tags of the images, run the Nexus task "delete unused manifests and docker images".
8. After step 7, run the compaction task to reclaim the storage.

Can I directly deploy generated docker image without pushing it to DockerHub?

I don't want to push a docker build image to DockerHub. Is there any way to directly deploy a docker image from CircleCI to AWS/vps/vultr without having to push it to DockerHub?
I use docker save/load commands:
# save image to tar locally
docker save -o ./image.tar $IMAGEID
# copy to target host
scp ./image.tar user@host:~/
# load into target docker repo
ssh user@host "docker load -i ~/image.tar"
# tag the loaded target image
ssh user@host "docker tag $LOADED_IMAGE_ID myimage:latest"
PS: LOADED_IMAGE_ID can be retrieved in the following way:
LOADED_IMAGE_ID=$(ssh user@host "docker load -i ~/image.tar" | grep -o "sha256:.*")
Update:
You can gzip the output to make it smaller. (docker load reads gzip-compressed archives directly, so there is no need to unzip the archive before loading.)
docker save $IMAGEID | gzip > image.tar.gz
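If the grep on the docker load output feels brittle, the same extraction can be sketched in Python. This assumes docker load reports each result on its own line as either "Loaded image: name:tag" or "Loaded image ID: sha256:...", which may differ between Docker versions; the function name is hypothetical.

```python
import re

# Assumed output format of `docker load`; adjust if your version prints differently.
_LOADED = re.compile(r"^Loaded image(?: ID)?: (?P<ref>\S+)$", re.MULTILINE)


def loaded_references(load_output):
    """Return every image reference or ID reported in `docker load` output."""
    return [match.group("ref") for match in _LOADED.finditer(load_output)]
```

An image saved by ID carries no tag, so the untagged "Loaded image ID" form is the one to expect in the flow above.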
You could setup your own registry: https://docs.docker.com/registry/deploying/
Edit: As i.bondarenko said, docker save/load are the better commands for your needs.
Disclaimer: I am the author of Dogger.
I wrote a blog post about a tool that allows just that: https://medium.com/@mathiaslykkegaardlorenzen/hosting-a-docker-app-without-pushing-an-image-d4503de37b89

docker get rid of every image of specific type except latest

I'd like to automate the removal of Docker layers that aren't used anymore, since Docker loves to gobble up hard drive space.
So I'd like a script that would remove all images of a specific type except the last used image. So I'm guessing there
REPOSITORY       TAG       IMAGE ID       CREATED          SIZE
mop-test-image   b4ffabd   a16fc65f4d19   10 minutes ago   1.95GB
mop-test-image   e7e5b14   7971bf4c01ce   17 minutes ago   1.95GB
mop-test-image   4325d4e   d6a3377f609a   32 minutes ago   1.95GB
In the list above, I'd like all the images removed except for the one created 10 minutes ago.
I currently use this to remove all of the images of that kind, so it needs tweaking:
docker rmi $(docker images | grep test- | tr -s ' ' | cut -d ' ' -f 3)
Use the until filter:
docker image prune -a --force --filter "until=10m"
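This deletes all unused images created more than 10 minutes ago.
Note that docker image prune only supports the until and label filters, so to restrict pruning to one kind of image you would filter on a label rather than on the repository or ID.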
If you are more interested in a solution that won’t require knowledge of how old the cutoff date is, the following script should do it.
It simply loops over the docker images, stores the image name in the lookup if it hasn’t been seen before, otherwise deletes it.
https://gist.github.com/Mazyod/da92f8cda1783baa017f9323375c159c
#!/bin/bash
set -e
echo "Script for cleaning up Docker images"
# First, we grab a list of all images
docker_images=$(docker images --format "{{.ID}}|{{.Repository}}|{{.Tag}}")
# prepare a image name lookup
declare -A image_names
# Then, we loop through the list
while read -r line; do
    # We split the line into an array
    IFS='|' read -r -a array <<< "$line"
    # We grab the image ID
    image_id=${array[0]}
    # We grab the image name
    image_name=${array[1]}
    # We grab the image tag
    image_tag=${array[2]}
    # We check if the image_name has already been saved in image_names
    if [[ -z "${image_names[$image_name]}" ]]; then
        # If not, we save it
        echo "Keeping ${image_name}:${image_tag}"
        image_names[$image_name]=$image_id
    else
        # If yes, we remove the image
        echo "Removing ${image_name}:${image_tag}"
        docker rmi "${image_name}:${image_tag}"
    fi
done <<< "$docker_images"

How to verify if the content of two Docker images is exactly the same?

How can we determine that two Docker images have exactly the same file system structure, and that the content of corresponding files is the same, irrespective of file timestamps?
I tried the image IDs but they differ when building from the same Dockerfile and a clean local repository. I did this test by building one image, cleaning the local repository, then touching one of the files to change its modification date, then building the second image, and their image IDs do not match. I used Docker 17.06 (the latest version I believe).
If you want to compare content of images you can use docker inspect <imageName> command and you can look at section RootFS
docker inspect redis
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:eda7136a91b7b4ba57aee64509b42bda59e630afcb2b63482d1b3341bf6e2bbb",
"sha256:c4c228cb4e20c84a0e268dda4ba36eea3c3b1e34c239126b6ee63de430720635",
"sha256:e7ec07c2297f9507eeaccc02b0148dae0a3a473adec4ab8ec1cbaacde62928d9",
"sha256:38e87cc81b6bed0c57f650d88ed8939aa71140b289a183ae158f1fa8e0de3ca8",
"sha256:d0f537e75fa6bdad0df5f844c7854dc8f6631ff292eb53dc41e897bc453c3f11",
"sha256:28caa9731d5da4265bad76fc67e6be12dfb2f5598c95a0c0d284a9a2443932bc"
]
}
If all layers are identical, then the images contain identical content.
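The comparison can be automated with a short Python sketch that reads the JSON printed by docker inspect for each image and compares the RootFS.Layers lists. The function names and the sample JSON are hypothetical. Keep in mind that the check is one-directional: equal layer digests mean equal content, but different digests do not prove the files differ, since a layer digest also covers metadata such as timestamps.

```python
import json


def layer_digests(inspect_output):
    """Extract RootFS.Layers from the JSON printed by `docker inspect <image>`."""
    data = json.loads(inspect_output)
    entry = data[0] if isinstance(data, list) else data
    return entry["RootFS"]["Layers"]


def same_layers(inspect_a, inspect_b):
    """True when both images list exactly the same layers in the same order."""
    return layer_digests(inspect_a) == layer_digests(inspect_b)


# Hypothetical, shortened inspect outputs for illustration only.
SAMPLE_A = '[{"RootFS": {"Type": "layers", "Layers": ["sha256:aaa", "sha256:bbb"]}}]'
SAMPLE_B = '[{"RootFS": {"Type": "layers", "Layers": ["sha256:aaa", "sha256:ccc"]}}]'
```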
After some research I came up with a solution which is fast and clean per my tests.
The overall solution is this:
Create a container for your image via docker create ...
Export its entire file system to a tar archive via docker export ...
Pipe the archive directory names, symlink names, symlink contents, file names, and file contents to a hash function (e.g., MD5)
Compare the hashes of different images to verify if their contents are equal or not
And that's it.
Technically, this can be done as follows:
1) Create file md5docker, and give it execution rights, e.g., chmod +x md5docker:
#!/bin/sh
dir=$(dirname "$0")
docker create $1 | { read cid; docker export $cid | $dir/tarcat | md5; docker rm $cid > /dev/null; }
2) Create file tarcat, and give it execution rights, e.g., chmod +x tarcat:
#!/usr/bin/env python3
# coding=utf-8
if __name__ == '__main__':
    import sys
    import tarfile
    with tarfile.open(fileobj=sys.stdin.buffer, mode="r|*") as tar:
        for tarinfo in tar:
            if tarinfo.isfile():
                print(tarinfo.name, flush=True)
                with tar.extractfile(tarinfo) as file:
                    sys.stdout.buffer.write(file.read())
            elif tarinfo.isdir():
                print(tarinfo.name, flush=True)
            elif tarinfo.issym() or tarinfo.islnk():
                print(tarinfo.name, flush=True)
                print(tarinfo.linkname, flush=True)
            else:
                print("\33[0;31mIGNORING:\33[0m ", tarinfo.name, file=sys.stderr)
3) Now invoke ./md5docker <image>, where <image> is your image name or id, to compute an MD5 hash of the entire file system of your image.
To verify if two images have the same contents just check that their hashes are equal as computed in step 3).
Note that this solution only considers as content directory structure, regular file contents, and symlinks (soft and hard). If you need more just change the tarcat script by adding more elif clauses testing for the content you wish to include (see Python's tarfile, and look for methods TarInfo.isXXX() corresponding to the needed content).
The only limitation I see in this solution is its dependency on Python (I am using Python3, but it should be very easy to adapt to Python2). A better solution without any dependency, and probably faster (hey, this is already very fast), is to write the tarcat script in a language supporting static linking so that a standalone executable file was enough (i.e., one not requiring any external dependencies, but the sole OS). I leave this as a future exercise in C, Rust, OCaml, Haskell, you choose.
Note, if MD5 does not suit your needs, just replace md5 inside the first script with your hash utility.
Hope this helps anyone reading.
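A possible single-file variant of the same idea, sketched in Python with hashlib instead of an external md5 command. It is not the script above, and the function names and sample helper are hypothetical. It hashes names, entry types, link targets, and file bytes while ignoring timestamps, and it sorts the entries first so that the order of members in the archive does not affect the result.

```python
import hashlib
import io
import tarfile


def hash_tar_contents(fileobj):
    """Return a SHA-256 hex digest of names, types, link targets and file bytes."""
    entries = []
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for info in tar:
            if info.isfile():
                data = tar.extractfile(info)
                digest = hashlib.sha256()
                for chunk in iter(lambda: data.read(65536), b""):
                    digest.update(chunk)
                entries.append((info.name, "file", digest.hexdigest()))
            elif info.isdir():
                entries.append((info.name, "dir", ""))
            elif info.issym() or info.islnk():
                entries.append((info.name, "link", info.linkname))
    entries.sort()
    total = hashlib.sha256()
    for name, kind, value in entries:
        total.update(f"{name}\0{kind}\0{value}\n".encode("utf-8", "surrogateescape"))
    return total.hexdigest()


def sample_tar(content, mtime):
    """Build a one-file tar archive in memory (demo helper only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo("etc/motd")
        info.size = len(content)
        info.mtime = mtime
        tar.addfile(info, io.BytesIO(content))
    buffer.seek(0)
    return buffer
```

To use it on a real image, call hash_tar_contents(sys.stdin.buffer) in a small script and pipe docker export into it. Two archives that differ only in modification times produce the same digest.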
Amazes me that docker doesn't do this sort of thing out of the box. Here's a variant on @mljrg's technique:
#!/bin/sh
docker create $1 | {
read cid
docker export $cid | tar Oxv 2>&1 | shasum -a 256
docker rm $cid > /dev/null
}
It's shorter and doesn't need a Python dependency or a second script at all. I'm sure there are downsides, but it seems to work for me in the few tests I've done.
There doesn't seem to be a standard way for doing this. The best way that I can think of is using the Docker multistage build feature.
For example, here I am comparing the alpine and debian images. In your case, set the image names to the ones you want to compare.
I basically copy all the files from each image into a git repository and commit after each copy.
FROM alpine as image1
FROM debian as image2
FROM ubuntu
RUN apt-get update && apt-get install -y git
RUN git config --global user.email "you@example.com" &&\
    git config --global user.name "Your Name"
RUN mkdir images
WORKDIR images
RUN git init
COPY --from=image1 / .
RUN git add . && git commit -m "image1"
COPY --from=image2 / .
RUN git add . && git commit -m "image2"
CMD tail > /dev/null
This will give you an image with a git repository that records the differences between the two images.
docker build -t compare .
docker run -it compare bash
Now if you do a git log you can see the logs and you can compare the two commits using git diff <commit1> <commit2>
Note: If the image building fails at the second commit, this means that the images are identical, since a git commit will fail if there are no changes to commit.
If we rebuild the Dockerfile it is almost certainly going to produce a new hash.
The only way to create an image with the same hash is to use docker save and docker load. See https://docs.docker.com/engine/reference/commandline/save/
We could then use Bukharov Sergey's answer (i.e. docker inspect) to inspect the layers, looking at the section with key 'RootFS'.

Detect if Docker image would change on running build

I am building a Docker image using a command line like the following:
docker build -t myimage .
Once this command has succeeded, then rerunning it is a no-op as the image specified by the Dockerfile has not changed. Is there a way to detect if the Dockerfile (or one of the build context files) subsequently changes without rerunning this command?
Looking at docker inspect $image_name from one build to another, several pieces of information stay the same if the image hasn't changed. One of them is the image Id. So I used the Id to check whether an image has changed, as follows:
First, one can get the image Id as follows:
docker inspect --format {{.Id}} $docker_image_name
Then, to check if there is a change after a build, you can follow these steps:
Get the image id before the build
Build the image
Get the image id after the build
Compare the two ids, if they match there is no change, if they don't match, there was a change.
Code: Here is a working bash script implementing the above idea:
docker inspect --format {{.Id}} $docker_image_name > deploy/last_image_build_id.log
# I get the docker last image id from a file
last_docker_id=$(cat deploy/last_image_build_id.log)
docker build -t $docker_image_name .
docker_id_after_build=$(docker inspect --format {{.Id}} $docker_image_name)
if [ "$docker_id_after_build" != "$last_docker_id" ]; then
echo "image changed"
else
echo "image didn't change"
fi
There isn't a dry-run option if that's what you are looking for. You can use a different tag to avoid affecting existing images and look for ---> Using cache in the output (then delete the tag if you don't want it).
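Scanning the build output for cache hits could look roughly like the Python sketch below. It assumes the classic builder's text format ("Step N/M : ..." lines followed by " ---> Using cache" on a hit); BuildKit prints a different format, so treat the patterns as assumptions to adjust. FROM steps are skipped because they are assumed never to report a cache hit. The function name and sample output are hypothetical.

```python
import re

# Assumed classic-builder step header, e.g. "Step 2/3 : RUN echo hi".
_STEP = re.compile(r"^Step \d+/\d+ : (?P<instruction>.*)$")


def uncached_steps(build_output):
    """Return the instructions that were not followed by '---> Using cache'."""
    rebuilt = []
    current = None
    for line in build_output.splitlines():
        match = _STEP.match(line)
        if match:
            if current is not None:
                rebuilt.append(current)
            instruction = match.group("instruction")
            is_from = instruction.upper().startswith("FROM ")
            current = None if is_from else instruction
        elif "---> Using cache" in line:
            current = None
    if current is not None:
        rebuilt.append(current)
    return rebuilt


# Hypothetical build output in which only the COPY step was rebuilt.
SAMPLE_OUTPUT = """Step 1/3 : FROM alpine
 ---> a24bb4013296
Step 2/3 : RUN echo hi
 ---> Using cache
 ---> 1234abcd5678
Step 3/3 : COPY . /app
 ---> 9abcdef01234
Successfully built 9abcdef01234
"""
```

An empty result suggests every step came from the cache, i.e. the image would not have changed.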
