Building the tensorflow-serving GPU Docker image from source?

I have followed these steps for building the images from source:
git clone https://github.com/tensorflow/serving
cd serving
docker build --pull -t $USER/tensorflow-serving-devel-gpu \
-f tensorflow_serving/tools/docker/Dockerfile.devel-gpu .
docker build -t $USER/tensorflow-serving-gpu \
--build-arg TF_SERVING_BUILD_IMAGE=$USER/tensorflow-serving-devel-gpu \
-f tensorflow_serving/tools/docker/Dockerfile.gpu .
These took quite a long time to compile, and both builds completed successfully.
Now when I check docker images, I see the following:
REPOSITORY TAG IMAGE ID CREATED SIZE
root/tensorflow-serving-gpu latest 42e221bb6bc9 About an hour ago 8.49GB
root/tensorflow-serving-devel-gpu latest 7fd974e5e0c5 2 hours ago 21.8GB
nvidia/cuda 11.0-cudnn8-devel-ubuntu18.04 7c49b261611b 3 months ago 7.41GB
I have two questions about this:
Building from source took a long time, so I want to back up / save these images so I can reuse them later on a different machine with the same architecture. If you know how to do this, please help me with the commands.
Since the build completed successfully, I want to free up some space by removing the development images that were only needed to build tensorflow-serving-gpu. I have three images here related to tensorflow serving and I don't know which ones are safe to delete.

If you want to save images:
docker save root/tensorflow-serving-gpu:latest -o tfs.tar
And if you want to load it:
docker load -i tfs.tar
root/tensorflow-serving-gpu and root/tensorflow-serving-devel-gpu are two different images. You can see the differences by looking at the details of Dockerfile.devel-gpu and Dockerfile.gpu.
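As for which images can be removed: Dockerfile.gpu copies the compiled server binary out of the devel image, so the runtime image should be self-contained. Assuming you don't need to rebuild from source on this machine, a minimal cleanup sketch would be:
docker rmi root/tensorflow-serving-devel-gpu:latest
docker rmi nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04
Keep root/tensorflow-serving-gpu (and its saved tar), since that is the image you actually serve from.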

Related

How to remove all prior builds, containers, repositories, and tags on Docker build?

This is my current docker / docker-compose workflow:
Make some code changes
sudo docker build --rm . -t image_name
Ctrl + C docker-compose process
sudo docker-compose down
sudo docker-compose up
Why then, when I run docker system prune, does the space reclaimed increase in proportion to the number of times I run through this process using sudo docker build --rm . -t image_name?
deleted: sha256: blablablah
deleted: sha256: ...
deleted: sha256: blablablah
Total reclaimed space: 1.948GB // Roughly proportional to the number of build runs
I was under the impression that sudo docker build --rm . -t image_name, using --rm, would remove any intermediate images and containers. Or is it just images that get automatically deleted, while past containers persist? If so, is there a way to clean up past containers and images on each Docker build without having to remember to occasionally run docker system prune? Should I set up a job to automate this in development?
EDIT: I do not think my question is a duplicate of this question. When I run sudo docker images, there are no <none> images listed. Therefore, it is not a duplicate of "What are <none> repository and tags? Why do they appear when I use docker build?" However, a clue to the answer to my question does appear in the comments:
You built an image with docker build -t myname/NewImage:0.1 . more than one time. The first time you do this, it creates an image tagged with myname/NewImage:0.1. The second time you do this, it creates a new image which gets the myname/NewImage:0.1 tag, leaving the first image you built with tags of <none>.
This comment is accurate, but I do not appear to have any <none> images. Further, even with this comment, my follow-up question of how to avoid this remains unanswered:
"...is there a way to clean up past containers and images on each Docker build without having to remember to occasionally run docker system prune?"
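One possible workaround, sketched here on the assumption that the reclaimed space comes mostly from dangling images and stopped containers, is to chain a targeted prune onto each build:
sudo docker build --rm . -t image_name && sudo docker image prune -f
# optionally also remove stopped containers left behind between compose runs
sudo docker container prune -f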

Correctly keeping docker VSTS / Azure Devops build agent clean yet cached

We have added a dockerised build agent to our development Kubernetes cluster which we use to build our applications as part of our Azure Devops pipelines. We created our own image based on the deprecated Microsoft/vsts-agent-docker on Github.
The build agent uses Docker outside of Docker (DooD) to create images on our development cluster.
This agent was working well for a few days but then an error would occasionally occur on the docker commands in our build pipeline:
Error response from daemon: No such image: fooproject:ci-3284.2
/usr/local/bin/docker failed with return code: 1
We realised that the build agent was creating tons of images that were never being removed. These were clogging up the build agent, and some images were going missing, which would explain the "no such image" error message.
By adding a step to our build pipelines with the following command we were able to get our build agent working again:
docker system prune -f -a
But of course this then removes all our images, and they must be built from scratch every time, which causes our builds to take an unnecessarily long time.
I'm sure this must be a solved problem but I haven't been able to locate any documentation on the normal strategy for dealing with a dockerised build agent becoming clogged over time. Being new to docker and kubernetes I may simply not know what I am looking for. What is the best practice for creating a dockerised build agent that stays clean and functional, while maintaining a cache?
EDIT: Some ideas:
Create a build step that cleans up all but the latest image for the given pipeline (this might still clog the build server though).
Have a cron job run that removes all the images every x days (this would result in slow builds the first time after the job runs, and could still clog the build server if it sees heavy usage).
Clear all images nightly and run all builds outside of work hours. This way builds would run quickly during the day. However heavy usage could still clog the build server.
EDIT 2:
I found someone with a docker issue on Github that seems to be trying to do exactly the same thing as me. He came up with a solution which he described as follows:
I was exactly trying to figure out how to remove "old" images out of my automated build environment without removing my build dependencies. This means I can't just remove by age, because the nodejs image might not change for weeks, while my app builds can be worthless in literally minutes.
docker image rm $(docker image ls --filter reference=docker --quiet)
That little gem is exactly what I needed. I dropped my repository name in the reference variable (not the most self-explanatory). Since I tag both the build number and latest, the docker image rm command fails on the images I want to keep. I really don't like using daemon errors as a protection mechanism, but it's effective.
Trying to follow these directions, I applied the latest tag to everything built during the process, and then ran:
docker image ls --filter reference=fooproject
If I try to remove these I get the following error:
Error response from daemon: conflict: unable to delete b870ec9c12cc (must be forced) - image is referenced in multiple repositories
This prevents the latest image from being removed. However, it is not exactly a clean way of doing this. Is there a better way?
Probably you've already found a solution, but it might be useful for the rest of the community to have an answer here.
docker prune has a limited purpose. It was created to address the issue of cleaning up all local Docker images (as mentioned by thaJeztah here).
To remove images more precisely, it's better to divide the task into two parts:
1. select/filter the images to delete
2. delete the list of selected images
E.g:
docker image rm $(docker image ls --filter reference=docker --quiet)
docker image rm $(sudo docker image ls | grep 1.14 | awk '{print $3}')
docker image ls --filter reference=docker --quiet | xargs docker image rm
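All three forms have the same effect. One caveat worth noting: if the filter matches nothing, docker image rm ends up being invoked with no arguments and exits with a usage error, so in scripts the xargs variant is the safer choice when combined with GNU xargs' -r (--no-run-if-empty) flag.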
It is possible to combine filter clauses to get exactly what you want:
(I'm using Kubernetes master node as an example environment)
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.14.2 5c24210246bb 3 months ago 82.1MB
k8s.gcr.io/kube-apiserver v1.14.2 5eeff402b659 3 months ago 210MB
k8s.gcr.io/kube-controller-manager v1.14.2 8be94bdae139 3 months ago 158MB
k8s.gcr.io/kube-scheduler v1.14.2 ee18f350636d 3 months ago 81.6MB # before
quay.io/coreos/flannel v0.11.0-amd64 ff281650a721 6 months ago 52.6MB
k8s.gcr.io/coredns 1.3.1 eb516548c180 7 months ago 40.3MB # since
k8s.gcr.io/etcd 3.3.10 2c4adeb21b4f 8 months ago 258MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 20 months ago 742kB
$ docker images --filter "since=eb516548c180" --filter "before=ee18f350636d"
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.11.0-amd64 ff281650a721 6 months ago 52.6MB
$ docker images --filter "since=eb516548c180" --filter "reference=quay.io/coreos/flannel"
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.11.0-amd64 ff281650a721 6 months ago 52.6MB
$ docker images --filter "since=eb516548c180" --filter "reference=quay*/*/*"
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.11.0-amd64 ff281650a721 6 months ago 52.6MB
$ docker images --filter "since=eb516548c180" --filter "reference=*/*/flan*"
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.11.0-amd64 ff281650a721 6 months ago 52.6MB
As mentioned in the documentation, the images / image ls filters are much richer than the docker prune filter, which supports only the until clause:
The currently supported filters are:
• dangling (boolean - true or false)
• label (label=<key> or label=<key>=<value>)
• before (<image-name>[:<tag>], <image id> or <image@digest>) - filter images created before given id or references
• since (<image-name>[:<tag>], <image id> or <image@digest>) - filter images created since given id or references
If you need more than one filter, then pass multiple flags
(e.g., --filter "foo=bar" --filter "bif=baz")
You can use other linux cli commands to filter docker images output:
grep "something" # to include only specified images
grep -v "something" # to exclude images you want to save
sort [-k colN] [-r] [-g] | head/tail -nX # to select the X oldest or newest images
Combining them and putting the result into your CI/CD pipeline allows you to keep only the required images in the local cache without accumulating a lot of garbage on your build server.
I've copied here a good example of this approach, provided by strajansebastian in the comments:
#example of deleting all builds except last 2 for each kind of image
#(the image kind is based on the Repository value.)
#If you want to preserve just the last build, change tail -n+3 to tail -n+2.
# delete dead containers
docker container prune -f
# keep last 2 builds for each image from the repository
for diru in `docker images --format "{{.Repository}}" | sort | uniq`; do
    for dimr in `docker images --format "{{.ID}};{{.Repository}}:{{.Tag}};'{{.CreatedAt}}'" --filter reference="$diru" | sed -r "s/\s+/~/g" | tail -n+3`; do
        img_tag=`echo $dimr | cut -d";" -f2`;
        docker rmi $img_tag;
    done;
done
# clean dangling images if any
docker image prune -f
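This works because docker images lists images newest-first, so tail -n+3 skips the two most recent tags of each repository and everything older is removed.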

Why can't I run a newly created Docker image?

I created two new images, anubh_custom_build_image/ubuntu_bionic:version1 & ubuntu_bionic_mldev:version1, from the base image ubuntu:bionic. The purpose of creating the custom-built Ubuntu docker images was to use a Linux system on the Windows platform. I have faced many issues in the past; one such is installing a new version of the tensorflow library with ! pip install -q tf-nightly, since I can't find a substitute for ! to run this command in the Windows cmd-prompt/PowerShell. Moreover, I want to invest more time in my codebase rather than fixing issues on different OSes. So, I pulled the latest Ubuntu image from Docker, installed a bunch of libraries for my usage, and committed using the docker commit command:
docker commit 503130713dff ubuntu_bionic_MLdev:version1
I can see the images using:
PS C:\Users\anubh> docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu_bionic_mldev version1 e7d1b154b69f 21 hours ago 9.33GB
anubh_custom_build_image/ubuntu_bionic version1 3c98f8954731 22 hours ago 9.33GB
tensorflow/tensorflow latest 2c8d1fd8bde4 2 days ago 1.25GB
ubuntu bionic 735f80812f90 2 weeks ago 83.5MB
ubuntu latest 735f80812f90 2 weeks ago 83.5MB
floydhub/dl-docker cpu 0b9fc622f1b7 2 years ago 2.87GB
When I tried to spin up containers using these images, the following commands ran without any error:
PS C:\Users\anubh> docker run anubh_custom_build_image/ubuntu_bionic:version1
PS C:\Users\anubh> docker run ubuntu_bionic_mldev:version1
EDIT:
The issue is that the run command executes, but the containers aren't spinning up for the above two images. I apologize for attaching the wrong error message in the first post; I have edited it now. The two containers below were spun up using the docker run -it -p 8888:8888 tensorflow/tensorflow & docker run ubuntu:bionic commands.
PS C:\Users\anubh> docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
94d59b217b70 tensorflow/tensorflow "/run_jupyter.sh --a…" 21 hours ago Up 21 hours 6006/tcp, 8888/tcp boring_clarke
503130713dff ubuntu:bionic "bash" 38 hours ago Up 38 hours awesome_bardeen
Could anyone suggest what I am missing to run these images, anubh_custom_build_image/ubuntu_bionic:version1 & ubuntu_bionic_mldev:version1 (built from the base image ubuntu:bionic), as containers?
Also, I can't find the location of any of these images on my disk.
Could anyone also suggest where to look inside the Windows OS?
NOTE: I will write a Dockerfile in future to build a custom image, but for now, I want to use the commit command to create new images & use them.
Your docker run command doesn't work because you don't have the :version1 tag at the end of it. (Your question claims you do, but the actual errors you quote don't have it.)
However: if this was anything more than a simple typo, the community would probably be unable to help you, because you have no documentation about what went into your image. (And if not "the community", then "you, six months later, when you discover your image has a critical security vulnerability".) Learn how the Dockerfile system works and use docker build to build your custom images. Best-practice workflows tend to not use docker commit at all.
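A minimal sketch of that workflow might look like the following (the package list is purely illustrative, not what was actually installed in the committed image):
FROM ubuntu:bionic
# illustrative packages only; list whatever the image actually needs
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install tf-nightly
CMD ["bash"]
Then build and run it with an explicit tag:
docker build -t anubh_custom_build_image/ubuntu_bionic:version2 .
docker run -it anubh_custom_build_image/ubuntu_bionic:version2
Note the -it: a container whose command is an interactive shell exits immediately when run without a terminal attached, which may also be why the containers above appear not to spin up.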

Can I obtain the Docker layer history on non-final stage Docker builds?

I'm working out a way to do Docker layer caching in CircleCI, and I've got a working solution. However, I am trying to improve it. The problem in any form of CI is that the image history is wiped for every build, so one needs to work out what files to restore, using the CI system's caching directives, and then what to load back into Docker.
First I tried this, inspired by this approach on Travis. To restore:
if [ -f /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz ]; then gunzip -c /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz | docker load; docker images; fi
And to create:
docker save $(docker history -q ${CIRCLE_PROJECT_REPONAME}:latest | grep -v '<missing>') | gzip > /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
This seemed to work OK, but my Dockerfile uses a two-stage build, and as soon as I COPYed files from the first to the final, it stopped referencing the cache. I assume this is because (a) docker history only applies to the final build, and (b) the non-cached changes in the first build stage have a new mtime, and so when they are copied to the final stage, they are regarded as new.
To get around this problem, I decided to try saving all images to the cache:
docker save $(docker images -a -q) | gzip > /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
This worked! However, it has a new problem: when I modify my Dockerfile, the old image cache will be loaded, new images will be added, and then everything will be stored in the cache. This will accumulate dead layers I will never need again, presumably until the CI provider's cache size limits are hit.
I think this can be fixed by caching all the stages of the build, but I am not sure how to reference the first stage. Is there a command I can run, similar to docker history -q -a, that will give me the hashes either for all non-last stages (since I can do the last one already) or for all stages including the last stage?
I was hoping docker build -q might do that, but it only prints the final hash, not all intermediate hashes.
Update
I have an inelegant solution, which does work, but there is surely a better way than this! I search the output of docker build for --->, which is Docker's way of announcing layer hashes and cache information. I strip out cache messages and arrows, leaving just the complete build layer hash list for all build stages:
docker build -t imagename . | grep '\-\-\->' | grep -v 'Using cache' | sed -e 's/[ >-]//g'
(I actually do the build twice - once for the build CI step proper, and a second time to gather the hashes. I could do it just once, but it feels nice to have the actual build in a separate step. The second build will always be cached, and will only take a few seconds to run).
Can this be improved upon, perhaps using Docker commands?
This is a summary of a conversation in the comments.
One option is to push all build stages to a remote. If there are two build stages, with the first one being named build and the second one unnamed, then one can do this:
docker build --target build --tag image-name-build .
docker build --tag image-name .
One can then push image-name (the final build artifact) and image-name-build (the first stage, which is normally thrown away) to a remote registry.
When rebuilding images, one can pull both of these onto the fresh CI build machine, and then do:
docker build --cache-from image-name-build --target build --tag image-name-build .
docker build --cache-from image-name --tag image-name .
As BMitch says, --cache-from indicates that these images can be trusted for the purposes of using them as a local layer cache.
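Put together, a CI job might run something like this (the registry prefix and image names are illustrative):
# pull previous images to seed the layer cache; ignore failure on the first run
docker pull registry.example.com/image-name-build || true
docker pull registry.example.com/image-name || true
docker build --cache-from registry.example.com/image-name-build \
    --target build --tag registry.example.com/image-name-build .
docker build --cache-from registry.example.com/image-name \
    --cache-from registry.example.com/image-name-build \
    --tag registry.example.com/image-name .
# push both so the next build can reuse them
docker push registry.example.com/image-name-build
docker push registry.example.com/image-name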
Comparison
The temporary solution in the question is good if you have a CI-native cache system to store files in, and you would rather not clutter up your registry with intermediate build stage images that are normally thrown away.
The --cache-from solution is nice because it is tidier, and uses Docker-native features rather than having to grep build output. It will also be very useful if your CI solution does not provide a file caching system, since it uses a remote registry instead.

How do I use the "git-like" capabilities of Docker?

I'm specifically interested in how I run or roll back to a specific version of the (binary) image from docker, and have tried to clarify this question to that effect.
The Docker FAQ says:
Docker includes git-like capabilities for tracking successive versions of a container, inspecting the diff between versions, committing new versions, rolling back etc. The history also includes how a container was assembled and by whom, so you get full traceability from the production server all the way back to the upstream developer.
Google as I may, I can find no example of "rolling back" to an earlier container, inspecting differences, etc. (Obviously I can do such things for version-managed Dockerfiles, but the binary Docker image / container can change even when the Dockerfile does not, due to updated software sources, and I'm looking for a way to see and roll back between such changes).
For a basic example: imagine I run
docker build -t myimage .
on a Dockerfile that just updates the base ubuntu:
FROM ubuntu:14.04
RUN apt-get update -q && apt-get upgrade -y
If I build this same image a few days later, how can I diff these images to see what packages have been upgraded? How can I roll back to the earlier version of the image after re-running the same build command later?
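One way to see package-level differences between two such builds, sketched here with illustrative tags (this compares installed packages rather than Docker layers), is to tag each build by date and diff the package lists:
docker build -t myimage:2014-08-01 .
# ... a few days later ...
docker build -t myimage:2014-08-10 .
diff <(docker run --rm myimage:2014-08-01 dpkg -l) \
     <(docker run --rm myimage:2014-08-10 dpkg -l)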
Technically we are only rolling back AUFS layers, not necessarily rolling back history. If our workflow consists of interactively modifying our container and committing the changes with docker commit, then this really does roll back history, in the sense that it removes any package updates applied in later layers, leaving the versions installed in earlier layers. This is very different if we rebuild the image from a Dockerfile: then nothing here lets us get back to the previous version we had built; we can only remove steps (layers) from the Dockerfile. In other words, we can only roll back the history of our docker commits to an image.
It appears the key to rolling back to an earlier version of a docker image is simply to point the docker tag at an earlier hash.
For instance, consider inspecting the history of the standard ubuntu:latest image:
docker history ubuntu:latest
Shows:
IMAGE CREATED CREATED BY SIZE
ba5877dc9bec 3 weeks ago /bin/sh -c #(nop) CMD [/bin/bash] 0 B
2318d26665ef 3 weeks ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.903 kB
ebc34468f71d 3 weeks ago /bin/sh -c rm -rf /var/lib/apt/lists/* 8 B
25f11f5fb0cb 3 weeks ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
9bad880da3d2 3 weeks ago /bin/sh -c #(nop) ADD file:de2b0b2e36953c018c 192.5 MB
511136ea3c5a 14 months ago 0 B
Imagine we want to go back to the image indicated by hash 25f:
docker tag 25f ubuntu:latest
docker history ubuntu:latest
And we see:
IMAGE CREATED CREATED BY SIZE
25f11f5fb0cb 3 weeks ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
9bad880da3d2 3 weeks ago /bin/sh -c #(nop) ADD file:de2b0b2e36953c018c 192.5 MB
511136ea3c5a 14 months ago 0 B
Of course, we probably never want to roll back in this way, since it makes ubuntu:latest not actually the most recent ubuntu in our local library. Note that we could have used any tag we wanted, e.g.
docker tag 25f ubuntu:notlatest
or simply launched the old image by hash:
docker run -it 25f /bin/bash
So simple and yet so neat. Note that we can combine this with docker inspect to get a bit more detail about the metadata of each image to which the Docker FAQ refers.
Also note that docker diff and docker commit are rather unrelated to this process, as they refer to containers (i.e. running images), not to images directly. That is, if we run an image interactively and then add or change a file in the container, we can see the changes (relative to the container's image) by using docker diff <container-id> and commit the change using docker commit <container-id>.
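For example (the container ID and file are illustrative):
$ docker run -it ubuntu:latest /bin/bash
root@c3f279d17e0a:/# touch /etc/testfile && exit
$ docker diff c3f279d17e0a
C /etc
A /etc/testfile
$ docker commit c3f279d17e0a myrepo/ubuntu:patched
docker diff marks each path as A (added), C (changed), or D (deleted).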
I'm not sure if you actually can use the hash as a tag. The hash IIRC is a reference to the image itself whereas the tag is more of a metadata field on the image.
The tags feature imho is quite badly documented, but the way you should use it is probably with semantic versioning of sorts to organise your tags and images. We're moving a complex (12-microservice) system to Docker, and after relying on latest I quickly ended up doing something like semantic versioning, plus a changelog in the Git repo, to keep track of changes.
This can also be good if, say, you have a docker branch that automatically takes changes and triggers a build on DockerHub - you can update a changelog and know which hash/timestamp goes with what.
Personally as the DockerHub build triggers are slow at present I prefer manually declaring a tag for each image and keeping a changelog, but YMMV and I suspect the tools will get better for this.
