Combining multiple Docker images to create build environment - docker

I am the developer of a software product (NJOY) with build requirements of:
CMake 3.2
Python 3.4
gcc 6.2 or clang 3.9
gfortran 5.3+
In reading about Docker, it seems that I should be able to create an image with just these components so that I can compile my code and use it. Much of the documentation is written with the implication that one wants to create a scalable web architecture and thus, doesn’t appear to be applicable to compiled applications like what I’m trying to do. I know it is applicable, I just can’t seem to figure out what to do.
I’m struggling with separating the Docker concept from a Virtual Machine; I can only conceive of compiling my code in an environment that contains an entire OS instead of just the necessary components. I’ve begun a Docker image by starting with an Ubuntu image. This seems to work just fine, but I get the feeling that I’m overly complicating things.
I’ve seen a Docker image for gcc; I’d like to combine it with CMake and Python into an image that we can use. Is this even possible?
What is the right way to approach this?

Docker images cannot be combined; they are layered. You start from a base image and then install the additional tools you want on top of it.
For instance, you can start from the gcc image and build on it by creating a Dockerfile. Your Dockerfile might look something like:
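FROM gcc:latest
# Install CMake (refresh the package index first; -y answers the prompt non-interactively)
RUN apt-get update && apt-get install -y cmake
# Install Python 3
RUN apt-get update && apt-get install -y python3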
Then you build this Dockerfile to create the Docker image. This will give you an image that contains gcc, CMake and Python.
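As a variation on the Dockerfile above, the installs can be collapsed into one layer with a sanity check at the end. This is only a sketch: it assumes the gcc image is Debian-based (so apt-get is available), that the package names are cmake and python3, and that the base image already ships gfortran. The versions you get are whatever the base image's repositories provide, so compare them against the requirements list.

```dockerfile
# Sketch only: the tag and package names are assumptions, adjust to your needs.
FROM gcc:latest

# One RUN keeps the index refresh and the installs in a single layer;
# removing the apt lists afterwards keeps that layer smaller.
RUN apt-get update \
 && apt-get install -y --no-install-recommends cmake python3 \
 && rm -rf /var/lib/apt/lists/*

# Fail the build early if any required tool is missing.
RUN gcc --version && gfortran --version && cmake --version && python3 --version
```

It could then be built with something like docker build -t njoy-build . where njoy-build is a hypothetical image name.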

Related

How to install multiple Tensorflow versions?

I'm trying to run the code from this repository: https://github.com/danielgordon10/thor-iqa-cvpr-2018
It has the following requirements
Python 3.5
CUDA 8 or 9
cuDNN
Tensorflow 1.4 or 1.5
Ubuntu 16.04, 18.04
an installation of darknet
My system satisfies none of these. I don't want to reinstall tf/cuda/cudnn on my machine (especially if I have to do that every time I try to run deep learning code with different tensorflow requirements).
I'm looking for a way to install the requirements and run the code regardless of the host.
To my knowledge that is exactly what Docker is for.
Looking into this, there are Docker images from NVIDIA, for example one called "nvidia/cuda:9.1-cudnn7-runtime". Based on the name, I assumed that any image built with this as the base comes with CUDA installed. This does not seem to be the case: if I try to install darknet, it fails with the error that "cuda_runtime.h" is missing.
So what my question basically boils down to is: how do I keep multiple different versions of CUDA and tensorflow on the same machine? Ideally with Docker (or similar) so I won't have to go through the process too many times.
It feels like I'm missing and/or don't understand something obvious, because I can't imagine that it can be so hard to run tensorflow code with different versions without reinstalling things from scratch all the time.
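One likely cause of the missing "cuda_runtime.h": NVIDIA publishes its CUDA images in runtime and devel flavors. The runtime flavor carries only the shared libraries needed to run CUDA programs, while the devel flavor adds the headers and the compiler. A minimal sketch, assuming a devel counterpart of the tag above is still published (the exact tag name and its availability are assumptions):

```dockerfile
# Assumption: a -devel counterpart of the runtime tag from the question exists.
FROM nvidia/cuda:9.1-cudnn7-devel

# The devel flavor is expected to ship the CUDA headers under /usr/local/cuda;
# fail the build right away if the header darknet needs is not there.
RUN test -f /usr/local/cuda/include/cuda_runtime.h
```

Building one such image per project, each from a different tag, is one way to keep several CUDA versions side by side on a single host, provided the host's driver supports them.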

Building Python wheels in Docker for Raspberry Pi Zero on x86_64 machine

I'm hoping this is an appropriate venue for my question. There's a lot of pieces to this puzzle.
I'm building a container using Docker that is destined to run on a Raspberry Pi Zero. The RPi Zero has an ARMv6 hard-float processor. The container will run a Python program that includes some dependencies that must be compiled (uses binary libraries). I am able to build and run the container on the RPi Zero itself, but building the container literally takes hours. I'm hoping to 1) speed up the process of building and 2) allow this to happen in a CI environment.
The approach I've taken in the past to build minimal Python containers that have dependencies requiring compilation is to use a multistage Docker build. I first start up a container with a full toolchain, then run pip wheel to compile all requirements into .whl files. I then copy the .whl files to the final container, install any binary libraries using the typical package manager, and then point pip install at this cache (--find-links=/wheels) for the installation of Python dependencies. This approach also works just fine on the Pi, but as I said it takes forever.
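For readers unfamiliar with the multistage approach described above, it might look roughly like the following. This is a sketch, not the poster's actual Dockerfile: the base image, the build-base package and the requirements.txt file name are all assumptions chosen for illustration.

```dockerfile
# Hypothetical base image; substitute the image you actually build from.
ARG BASE=python:3-alpine

# Stage 1: full toolchain, compile every requirement into a .whl file.
FROM ${BASE} AS builder
# build-base is Alpine's compiler meta-package (assumed base is Alpine).
RUN apk add --no-cache build-base
COPY requirements.txt .
RUN pip wheel --wheel-dir=/wheels -r requirements.txt

# Stage 2: minimal runtime image that installs only from the wheel cache.
FROM ${BASE}
COPY --from=builder /wheels /wheels
COPY requirements.txt .
# --no-index stops pip from falling back to downloading and compiling sources.
RUN pip install --no-index --find-links=/wheels -r requirements.txt
```

Both stages need to share the same architecture and C library, otherwise the wheels built in the first stage will not install or run in the second.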
I've considered a few different approaches I could take:
Figure out how to get the Docker engine on my main dev machine (also in CI) to run and build an ARM image using qemu-arm-static while running docker build, and then somehow tag the resulting image as ARMv6 and upload it to my registry somehow. (I could just use a tag or a different repo name) I haven't honestly dug too deep into this, but my main concern is that every example I've seen of qemu-arm seems to indicate that it runs ARMv7 emulation. The RPi Zero can't actually even run many Docker containers that are made available for ARM due to this (exit 139). The arm32v6 "user" does provide working base images that run fine on the RPi Zero, which is what I'm using as the source images to build on my Pi itself.
Emulate an entire RasPi using qemu-system-arm. Again though, it looks like this emulates ARMv7, meaning the compiled wheels might not be able to run on the Pi zero.
Set up a cross-compiling toolchain for ARMv6. A few problems: I wouldn't know how to make sure pip uses that toolchain when compiling, and I'd also need to get and compile any other dependent library (possibly even all the way down to glibc?) so the header files will resolve.
It looks like this is easy to do if you want to do it for ARMv7 (which I believe the RPi 2 uses) or later, but I'm specifically using a Zero for my project, so I don't have that option.
TL;dr: How do I build binary Python wheels for ARMv6 using Docker without having to do it on a slow, single-core Raspberry Pi Zero?

What simple fixed-version images can I use with CircleCI?

I am trying to use a plain Linux image, like alpine, that is pinned to a fixed version.
I am trying to find simple images I can use.
I have spent a lot of time on CircleCI's site on pages such as
https://circleci.com/docs/2.0/tutorials/
and
https://circleci.com/docs/2.0/sample-config/#section=configuration
but every configuration I find seems to be targeted at a specific language or setup. I just want a plain Linux shell with bash. Ubuntu seems a bit large, hence me avoiding it. Are there other Linux distributions which are a bit lighter but do have tools like bash and curl built in?
I finally found that using
- image: alpine:latest
and
command: |
  apk add bash curl-dev
  echo Hello, world.
worked.
Are there other basic images I could use?
For instance, is there a fixed version rather than just 'latest', so I can ensure it works in the future and I am not surprised by a change?

Getting apt-get on an alpine container

I have to install a few dependencies in my Docker container. I want to use the python:3.6-alpine version to keep it as light as possible, but the apk package manager that comes with alpine is giving me trouble, so I would like to get the apt-get package manager instead. I tried:
apk add apt-get
and it didn't work.
How can I get it on the container?
Using multiple package systems is usually a very bad idea, for many reasons. Packages are likely to collide and break, and you'll end up with a much greater mess than you started with.
See this excellent answer for more detail: Is there a pitfall of using multiple package managers?
A more feasible approach would be troubleshooting and resolving the issues you are having with apk. apk is designed for simplicity and speed, and should take very little getting used to. It is really an excellent package manager, IMO.
For a good tutorial, I warmly recommend the apk introduction page at the Alpine Wiki site:
https://wiki.alpinelinux.org/wiki/Alpine_Linux_package_management
If you're determined not to use apk, and for the sake of experiment want to try bringing up apt instead, you'll first have to build apt from source: https://github.com/Debian/apt. Then, if it produces a functional build (not likely, since it's probably not compatible with musl libc), you'll have to wire it to some repositories, but Alpine repositories are only fit for apk, not apt. As you can see, this is not really feasible, and not the route you want to go down.
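To illustrate staying with apk, a Dockerfile along these lines is a common pattern. It is a sketch under assumptions: build-base and libffi-dev stand in for whatever your dependencies actually need, and requirements.txt is a hypothetical file name.

```dockerfile
FROM python:3.6-alpine

COPY requirements.txt .

# Build-time packages go into a named virtual group (.build-deps) so they can
# be removed again once pip has finished compiling.
# build-base and libffi-dev are placeholders for your real build dependencies;
# libraries needed at run time should be added in a separate apk add.
RUN apk add --no-cache --virtual .build-deps build-base libffi-dev \
 && pip install --no-cache-dir -r requirements.txt \
 && apk del .build-deps
```

Package names usually differ from their Debian equivalents, so searching the Alpine package index for the header or library that fails to resolve is often the quickest fix.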

Compile Tensorflow from source with Docker to get CPU speed up

I am looking for a way to set up or modify an existing Docker image for installing tensorflow so that the SSE4, AVX, AVX2, and FMA instructions can be used for CPU speed up. So far I have found how to install from source using bazel: How to Compile Tensorflow... and CPU instructions not compiled.... Neither of these explains how to do this within Docker. So I think what I am looking for is what you need to add to an existing Docker image that installs without these options so that you get a compiled version of tensorflow with the CPU options enabled. The existing Docker images do not do this because they are meant to run on as many machines as possible. I am using Ubuntu 14.04 on a Linux PC. I am new to Docker, but I have installed tensorflow and have it working without getting the CPU warnings I get when I use the Docker images. I may not need this for speed, but I have seen posts that claim the speed up can be significant. I searched for existing Docker images that do this and could not find anything. I need this to work with a GPU, so it needs to be compatible with nvidia-docker.
I just found this docker support for bazel and it might provide an answer; however, I do not understand it well enough to know for sure. I believe this is saying that you cannot build tensorflow with bazel inside a Dockerfile; you have to build a Dockerfile using bazel. Is my understanding correct, and is this the only way to get a Docker image with tensorflow compiled from source? If so, I could still use help with how to do it and still get the other dependencies that I would get if using an existing Docker image for tensorflow.
Dockerfiles that build with CPU support can be found here.
Hope that helps! Spent many a late night here on Stack Overflow and Github Issues and stuff. Now it's my turn to give back! :)
The GPU stuff in particular is really hairy - especially when enabling the XLA/JIT/AOT stuff as well as the Graph Transform Tools.
Lots of hacks embedded in my Dockerfiles. Feel free to review and ask me questions!
The contributing guidelines mention building TensorFlow from source with Docker to run the unit tests:
Refer to the CPU-only developer Dockerfile and GPU developer Dockerfile for the required packages. Alternatively, use the said Docker images, e.g., tensorflow/tensorflow:nightly-devel and tensorflow/tensorflow:nightly-devel-gpu for development to avoid installing the packages directly on your system.
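Building on one of those development images, a Dockerfile that compiles TensorFlow with the CPU instruction sets from the question might look like the sketch below. Several things here are assumptions rather than facts about the image: that the nightly-devel tag is still published, that the TensorFlow sources sit in /tensorflow, and that the source tree has already been configured for a build.

```dockerfile
# Assumptions: this tag is still published, the TensorFlow sources live in
# /tensorflow inside it, and the build has already been configured there.
FROM tensorflow/tensorflow:nightly-devel
WORKDIR /tensorflow

# Each --copt passes one flag to the compiler; enable only the instruction
# sets that the CPUs you will run on actually support.
RUN bazel build --config=opt \
      --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma \
      //tensorflow/tools/pip_package:build_pip_package

# Package the build output as a wheel and install it into the image.
RUN bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip \
 && pip install /tmp/pip/tensorflow-*.whl
```

An image built this way only runs on CPUs that support every enabled instruction set, which is the portability trade-off the question mentions. The same idea should carry over to the GPU development image, with CUDA support additionally enabled when the build is configured.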

Resources