Rebuild a docker image depending on the output of a command - docker

Say I want to rebuild my docker image when a new compiler version is available in some repository.
I can gather the version inside the container:
FROM centos:7
RUN yum info gcc | grep Version | sort | tail -1 | cut -d: -f2 | tr -d ' '
If I build this container and tag it as base, I can use that information and setup a second container:
FROM base
RUN yum install gcc-4.8.5
Docker will be able to cache the second container and not rebuild it, when the compiler version has not changed. But creating that requires some shell scripting and that might be brittle in, e.g., a continuous integration scenario.
What I would like to do is to introduce a unified source for these two containers. Is there a way to write something like this:
FROM centos:7
$GCC_VERSION=RUN yum info gcc | grep Version | sort | tail -1 | cut -d: -f2 | tr -d ' '
RUN yum install gcc-$GCC_VERSION
and have the variables expanded (and the commands still cached) during docker build ?

You can use the ARG instruction. Build args affect the cache in the way you want to: Impact on build caching
With a Dockerfile like this:
FROM centos:7
ARG GCC_VERSION
RUN yum install -y gcc-$GCC_VERSION
The image gets only rebuilt if the GCC_VERSION changes.
docker build --build-arg GCC_VERSION=$(docker run centos:7 yum info gcc | grep Version | sort | tail -1 | cut -d: -f2 | tr -d ' ') .

Related

How to specify _HANDLER in AWS Lambda Docker Image

I am trying to create a docker image to run a shell script as aws lambda function. As far as I understood, I need a runtime and a handler.
The default runtime (bootstrap) looks like this:
#!/bin/sh
set -euo pipefail
# Handler format: <script_name>.<bash_function_name>
#
# The script file <script_name>.sh must be located at the root of your
# function's deployment package, alongside this bootstrap executable.
source $(dirname "$0")/"$(echo $_HANDLER | cut -d. -f1).sh"
while true
do
# Request the next event from the Lambda runtime
HEADERS="$(mktemp)"
EVENT_DATA=$(curl -v -sS -LD "$HEADERS" -X GET "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next")
INVOCATION_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)
# Execute the handler function from the script
RESPONSE=$($(echo "$_HANDLER" | cut -d. -f2) "$EVENT_DATA")
# Send the response to Lambda runtime
curl -v -sS -X POST "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$INVOCATION_ID/response" -d "$RESPONSE"
done
The Dockerfile:
FROM public.ecr.aws/lambda/provided:al2
COPY bootstrap ${LAMBDA_RUNTIME_DIR}
COPY function.sh ${LAMBDA_TASK_ROOT}
RUN … (install some dependencies)
CMD ["function.handler"]
From what I've read, your supposed to set CMD to the handler. However this does not seem to work as the $_HANDLER variable is not known inside of the container.
I can't figure out what makes the difference, but this combination of Dockerfile and bootstrap works.
#!/bin/sh
set -euo pipefail
source "$LAMBDA_TASK_ROOT"/"$(echo $_HANDLER | cut -d. -f1).sh"
while true
do
# Request the next event from the Lambda runtime
HEADERS="$(mktemp)"
EVENT_DATA=$(curl -sS -LD "$HEADERS" -X GET "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next")
INVOCATION_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)
# Execute the handler function from the script
RESPONSE=$($(echo "$_HANDLER" | cut -d. -f2) "$EVENT_DATA")
# Send the response to Lambda runtime
curl -sS -X POST "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$INVOCATION_ID/response" -d "$RESPONSE"
done
FROM public.ecr.aws/lambda/provided:al2
COPY ./bootstrap ${LAMBDA_RUNTIME_DIR}
COPY ./function.sh ${LAMBDA_TASK_ROOT}
RUN chmod 755 ${LAMBDA_RUNTIME_DIR}/bootstrap ${LAMBDA_TASK_ROOT}/function.sh
RUN …
WORKDIR ${LAMBDA_TASK_ROOT}
CMD ["function.handler"]

Dockerhub: listing all available versions of a given image?

I'm looking for a way to list all publicly available versions of an image from Dockerhub. Is there a way this could be achieved?
Specifically, I'm interested in the openjdk:8-jdk-alpine images.
Dockerhub typically only lists the latest version of each image, and there are no linking to historic versions. For openjdk, it's currently 8u191-jdk-alpine3.8:
However, it possible to pull older versions if we know their image digest ID:
openjdk:8-jdk-alpine#sha256:1fd5a77d82536c88486e526da26ae79b6cd8a14006eb3da3a25eb8d2d682ccd6
openjdk:8-jdk-alpine#sha256:c5c705b462abab858066d412b3f871865684d8f837571c98b68e78c505dc7549
With some luck, I was able to find these digests for OpenJDK 8 (Java versions 1.8.0_171 and 1.8.0_151 respectively), by googling openjdk8 alpine digest and looking at github tickets, which included the image digest.
But, is there a systematic way for listing all publicly available digests?
Looking at docker search documentation, there's doesn't seem to be an option for listing the image version, only search by name.
You don't need digests to pull "old" images, you would rather use their tags (even if they are not displayed in Docker Hub).
I use the following command to retrieve tags of a particular image, parsing the output of https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags :
REPOSITORY=openjdk # can be "<registry>/<image_name>" ("google/cloud-sdk" for example)
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name'
Result for REPOSITORY=openjdk (1593 tags at the time of writing) looks like :
latest
10
10-ea
10-ea-32
10-ea-32-experimental
10-ea-32-jdk
10-ea-32-jdk-experimental
10-ea-32-jdk-slim
10-ea-32-jdk-slim-experimental
10-ea-32-jre
[...]
If you can't/don't want to install jq (tool to manipulate JSON), then you could use :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | \
tr '}' '\n' | \
awk -F: '{print $3}'
(I'm pretty sure I got this command from another question, but I can't find where)
You can of course filter the output of this command and keep only tags you're interested in :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name | select(match("^8.*jdk-alpine"))'
or :
wget -q https://registry.hub.docker.com/v1/repositories/$REPOSITORY/tags -O - | \
jq -r '.[].name' \
grep -E '^8.*jdk-alpine'

Docker caching for travis builds

Docker caching is not yet available on travis: https://github.com/travis-ci/travis-ci/issues/5358
I'm trying to write a workaround by doing:
`docker save -o file.tar $(docker history -q image_name | grep -v missing)`
`docker load -i file.tar
Which works great, gives me all the image layers back. My only problem now is the saving takes a long time, and most of the time I'm actually changing one layer, so I don't need to rewrite all the rest. Is there a way of telling the docker save command to skip layers already in file.tar?
In the manifest.json file inside the tar you have the information you need.
tar -xOf file.tar manifest.json
Check the value of the Config keys. The first 12 characters are the image id. You can use the command above, extract the image ids that you already have, and exclude them in your docker save command.
I'm not very good with bash scripting, but this works on my mac
tar -xOf file.tar manifest.json | tr , '\n' | grep -o '"Config":".*"' | awk -F ':' '{print $2}' | awk '{print substr($0,2,12)}'
Using this outputs everything
docker history -q IMAGE_HERE | grep -v missing && tar -xOf file.tar manifest.json | tr , '\n' | grep -o '"Config":".*"' | awk -F ':' '{print $2}' | awk '{print substr($0,2,12)}'
After this you only need to get the unique values. This could be done with sort and uniq -u, but for some reason, sort doesn't work as expected. This command assumes the presence of file.tar so take that into consideration too.
I couldn't find anything about append in the docker save command. The above strategy could work with multiple file tars that are all different with each other.

what's the best way to use command line applications in a docker environment?

For composing services that are exposed over the network through ports (-p 5000:5000) docker provides all kinds of help (e.g. docker-compose).
What I'm looking for is a way to provide certain binary applications as a service in a docker container. Examples would be:
compiler (e.g. gcc)
generators (e.g. for protocol buffers)
...
running a single command
my current approach is to have a small shell script that passes parameters an takes care of mapping the current directory:
runbox(){
current_dir=${PWD##*/}
docker run --rm -it \
-v ${PWD}:/home/dev/${current_dir} \
-w /home/dev/${current_dir} \
box:latest \
/bin/bash -c "$#"
}
using it looks like this:
runbox "g++ --version"
compared to running locally:
g++ --version
composing multiple commands
Now let's say I have n such service, how do I compose them in a way that all of them can be used in another container (e.g. running on a ci)?
Services running
in Docker containers
+---------------------+
| | +---------+
| CI Docker Container +-----> | gcc |
| | +---------+
| |
| make +-----> +---------+
| | | protoc |
| | +---------+
+----------------+----+
| +---------+
+----------> | clang |
+---------+
what about the overhead of calling a command very often?
As long as the CI gives you access to a docker engine (via socket or port) it shouldn't matter if it's run in a container or not.
If you're using make and running project tasks from a container, you might be interested in dobi. It's a tool designed to something like you're describing. You define the images, containers, mounts, etc that you use to build your project.
You can run each task like you would run a make target (dobi <taskname>). You could have one task for gcc, protoc, clang, etc.

How can I find a Docker image with a specific tag in Docker registry on the Docker command line?

I try to locate one specific tag for a Docker image. How can I do it on the command line? I want to avoid downloading all the images and then removing the unneeded ones.
In the official Ubuntu release, https://registry.hub.docker.com/_/ubuntu/, there are several tags (release for it), while when I search it on the command line,
user#ubuntu:~$ docker search ubuntu | grep ^ubuntu
ubuntu Official Ubuntu base image 354
ubuntu-upstart Upstart is an event-based replacement for ... 7
ubuntufan/ping 0
ubuntu-debootstrap 0
Also in the help of command line search https://docs.docker.com/engine/reference/commandline/search/, no clue how it can work?
Is it possible in the docker search command?
If I use a raw command to search via the Docker registry API, then the information can be fetched:
$ curl https://registry.hub.docker.com//v1/repositories/ubuntu/tags | python -mjson.tool
[
{
"layer": "ef83896b",
"name": "latest"
},
.....
{
"layer": "463ff6be",
"name": "raring"
},
{
"layer": "195eb90b",
"name": "saucy"
},
{
"layer": "ef83896b",
"name": "trusty"
}
]
When using CoreOS, jq is available to parse JSON data.
So like you were doing before, looking at library/centos:
$ curl -s -S 'https://registry.hub.docker.com/v2/repositories/library/centos/tags/' | jq '."results"[]["name"]' |sort
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
The cleaner v2 API is available now, and that's what I'm using in the example. I will build a simple script docker_remote_tags:
#!/usr/bin/bash
curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$#/tags/" | jq '."results"[]["name"]' |sort
Enables:
$ ./docker_remote_tags library/centos
"6"
"6.7"
"centos5"
"centos5.11"
"centos6"
"centos6.6"
"centos6.7"
"centos7.0.1406"
"centos7.1.1503"
"latest"
Reference:
jq: https://stedolan.github.io/jq/ | apt-get install jq
I didn't like any of the solutions above because A) they required external libraries that I didn't have and didn't want to install. B) I didn't get all the pages.
The Docker API limits you to 100 items per request. This will loop over each "next" item and get them all (for Python it's seven pages; other may be more or less... It depends)
If you really want to spam yourself, remove | cut -d '-' -f 1 from the last line, and you will see absolutely everything.
url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `# Initial url` ; \
( \
while [ ! -z $url ]; do `# Keep looping until the variable url is empty` \
>&2 echo -n "." `# Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the output to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be stored in a variable called content` ; \
url=$(echo "$content" | head -n 1) `# Let's get the first line of content which contains the next URL for the loop to continue` ; \
echo "$content" | tail -n +2 `# Print the content without the first line (yes +2 is counter intuitive)` ; \
done; \
>&2 echo `# Finally break the line of dots` ; \
) | cut -d '-' -f 1 | sort --version-sort | uniq;
Sample output:
$ url=https://registry.hub.docker.com/v2/repositories/library/redis/tags/?page_size=100 `#initial url` ; \
> ( \
> while [ ! -z $url ]; do `#Keep looping until the variable url is empty` \
> >&2 echo -n "." `#Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)` ; \
> content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))') `# Curl the URL and pipe the JSON to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages) then continue to loop over the results extracting only the name; all will be store in a variable called content` ; \
> url=$(echo "$content" | head -n 1) `#Let's get the first line of content which contains the next URL for the loop to continue` ; \
> echo "$content" | tail -n +2 `#Print the content with out the first line (yes +2 is counter intuitive)` ; \
> done; \
> >&2 echo `#Finally break the line of dots` ; \
> ) | cut -d '-' -f 1 | sort --version-sort | uniq;
...
2
2.6
2.6.17
2.8
2.8.6
2.8.7
2.8.8
2.8.9
2.8.10
2.8.11
2.8.12
2.8.13
2.8.14
2.8.15
2.8.16
2.8.17
2.8.18
2.8.19
2.8.20
2.8.21
2.8.22
2.8.23
3
3.0
3.0.0
3.0.1
3.0.2
3.0.3
3.0.4
3.0.5
3.0.6
3.0.7
3.0.504
3.2
3.2.0
3.2.1
3.2.2
3.2.3
3.2.4
3.2.5
3.2.6
3.2.7
3.2.8
3.2.9
3.2.10
3.2.11
3.2.100
4
4.0
4.0.0
4.0.1
4.0.2
4.0.4
4.0.5
4.0.6
4.0.7
4.0.8
32bit
alpine
latest
nanoserver
windowsservercore
If you want the bash_profile version:
function docker-tags () {
name=$1
# Initial URL
url=https://registry.hub.docker.com/v2/repositories/library/$name/tags/?page_size=100
(
# Keep looping until the variable URL is empty
while [ ! -z $url ]; do
# Every iteration of the loop prints out a single dot to show progress as it got through all the pages (this is inline dot)
>&2 echo -n "."
# Curl the URL and pipe the output to Python. Python will parse the JSON and print the very first line as the next URL (it will leave it blank if there are no more pages)
# then continue to loop over the results extracting only the name; all will be stored in a variable called content
content=$(curl -s $url | python -c 'import sys, json; data = json.load(sys.stdin); print(data.get("next", "") or ""); print("\n".join([x["name"] for x in data["results"]]))')
# Let's get the first line of content which contains the next URL for the loop to continue
url=$(echo "$content" | head -n 1)
# Print the content without the first line (yes +2 is counter intuitive)
echo "$content" | tail -n +2
done;
# Finally break the line of dots
>&2 echo
) | cut -d '-' -f 1 | sort --version-sort | uniq;
}
And simply call it: docker-tags redis
Sample output:
$ docker-tags redis
...
2
2.6
2.6.17
2.8
--trunc----
32bit
alpine
latest
nanoserver
windowsservercore
As far as I know, the CLI does not allow searching/listing tags in a repository.
But if you know which tag you want, you can pull that explicitly by adding a colon and the image name: docker pull ubuntu:saucy
This script (docker-show-repo-tags.sh) should work for any Docker enabled host that has curl, sed, grep, and sort. This was updated to reflect the fact the repository tag URLs changed.
This version correctly parses the "name": field without a JSON parser.
#!/bin/sh
# 2022-07-20
# Simple script that will display Docker repository tags
# using basic tools: curl, awk, sed, grep, and sort.
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
# $ docker-show-repo-tags.sh centos | cat -n
for Repo in "$#" ; do
URL="https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/"
curl -sS "$URL" | \
/usr/bin/sed -Ee 's/("name":)"([^"]*)"/\n\1\2\n/g' | \
grep '"name":' | \
awk -F: '{printf("'$Repo':%s\n",$2)}'
done
This older version no longer works. Many thanks to #d9k for pointing this out!
#!/bin/sh
# WARNING: This no long works!
# Simple script that will display Docker repository tags
# using basic tools: curl, sed, grep, and sort.
#
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
curl -sS "https://hub.docker.com/r/library/$Repo/tags/" | \
sed -e $'s/"tags":/\\\n"tags":/g' -e $'s/\]/\\\n\]/g' | \
grep '^"tags"' | \
grep '"library"' | \
sed -e $'s/,/,\\\n/g' -e 's/,//g' -e 's/"//g' | \
grep -v 'library:' | \
sort -fu | \
sed -e "s/^/${Repo}:/"
done
This older version no longer works. Many thanks to #viky for pointing this out!
#!/bin/sh
# WARNING: This no long works!
# Simple script that will display Docker repository tags.
#
# Usage:
# $ docker-show-repo-tags.sh ubuntu centos
for Repo in $* ; do
curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/" | \
sed -e $'s/,/,\\\n/g' -e $'s/\[/\\\[\n/g' | \
grep '"name"' | \
awk -F\" '{print $4;}' | \
sort -fu | \
sed -e "s/^/${Repo}:/"
done
This is the output for a simple example:
$ docker-show-repo-tags.sh centos | cat -n
1 centos:5
2 centos:5.11
3 centos:6
4 centos:6.10
5 centos:6.6
6 centos:6.7
7 centos:6.8
8 centos:6.9
9 centos:7.0.1406
10 centos:7.1.1503
11 centos:7.2.1511
12 centos:7.3.1611
13 centos:7.4.1708
14 centos:7.5.1804
15 centos:centos5
16 centos:centos5.11
17 centos:centos6
18 centos:centos6.10
19 centos:centos6.6
20 centos:centos6.7
21 centos:centos6.8
22 centos:centos6.9
23 centos:centos7
24 centos:centos7.0.1406
25 centos:centos7.1.1503
26 centos:centos7.2.1511
27 centos:centos7.3.1611
28 centos:centos7.4.1708
29 centos:centos7.5.1804
30 centos:latest
I wrote a command line tool to simplify searching Docker Hub repository tags, available in my PyTools GitHub repository. It's simple to use with various command line switches, but most basically:
./dockerhub_show_tags.py repo1 repo2
It's even available as a Docker image and can take multiple repositories:
docker run harisekhon/pytools dockerhub_show_tags.py centos ubuntu
DockerHub
repo: centos
tags: 5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
repo: ubuntu
tags: latest
14.04
15.10
16.04
trusty
trusty-20160503.1
wily
wily-20160503
xenial
xenial-20160503
If you want to embed it in scripts, use -q / --quiet to get just the tags, like normal Docker commands:
./dockerhub_show_tags.py centos -q
5.11
6.6
6.7
7.0.1406
7.1.1503
centos5.11
centos6.6
centos6.7
centos7.0.1406
centos7.1.1503
The v2 API seems to use some kind of pagination, so that it does not return all the available tags. This is clearly visible in projects such as python (or library/python). Even after quickly reading the documentation, I could not manage to work with the API correctly (maybe it is the wrong documentation).
Then I rewrote the script using the v1 API, and it is still using jq:
#!/bin/bash
repo="$1"
if [[ "${repo}" != */* ]]; then
repo="library/${repo}"
fi
url="https://registry.hub.docker.com/v1/repositories/${repo}/tags"
curl -s -S "${url}" | jq '.[]["name"]' | sed 's/^"\(.*\)"$/\1/' | sort
The full script is available at: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.sh
I've also written an improved version (in Python) that aggregates tags that point to the same version: https://github.com/denilsonsa/small_scripts/blob/master/docker_remote_tags.py
Add this function to your .zshrc file or run the command manually:
#usage list-dh-tags <repo>
#example: list-dh-tags node
function list-dh-tags(){
wget -q https://registry.hub.docker.com/v1/repositories/$1/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'
}
Thanks to this -> How can I list all tags for a Docker image on a remote registry?
For anyone stumbling across this in modern times, you can use Skopeo to retrieve an image's tags from the Docker registry:
$ skopeo list-tags docker://jenkins/jenkins \
| jq -r '.Tags[] | select(. | contains("lts-alpine"))' \
| sort --version-sort --reverse
lts-alpine
2.277.3-lts-alpine
2.277.2-lts-alpine
2.277.1-lts-alpine
2.263.4-lts-alpine
2.263.3-lts-alpine
2.263.2-lts-alpine
2.263.1-lts-alpine
2.249.3-lts-alpine
2.249.2-lts-alpine
2.249.1-lts-alpine
2.235.5-lts-alpine
2.235.4-lts-alpine
2.235.3-lts-alpine
2.235.2-lts-alpine
2.235.1-lts-alpine
2.222.4-lts-alpine
Reimplementation of the previous post, using Python over sed/AWK:
for Repo in $* ; do
tags=$(curl -s -S "https://registry.hub.docker.com/v2/repositories/library/$Repo/tags/")
python - <<EOF
import json
tags = [t['name'] for t in json.loads('''$tags''')['results']]
tags.sort()
for tag in tags:
print "{}:{}".format('$Repo', tag)
EOF
done
For a script that works with OAuth bearer tokens on Docker Hub, try this:
Listing the tags of a Docker image on a Docker hub through the HTTP API
You can use Visual Studio Code to provide autocomplete for available Docker images and tags. However, this requires that you type the first letter of a tag in order to see autocomplete suggestions.
For example, when writing FROM ubuntu it offers autocomplete suggestions like ubuntu, ubuntu-debootstrap and ubuntu-upstart. When writing FROM ubuntu:a it offers autocomplete suggestions, like ubuntu:artful and ubuntu:artful-20170511.1

Resources