I'm trying to run the code from this repository: https://github.com/danielgordon10/thor-iqa-cvpr-2018
It has the following requirements
Python 3.5
CUDA 8 or 9
cuDNN
Tensorflow 1.4 or 1.5
Ubuntu 16.04, 18.04
an installation of darknet
My system satisfies neither of these. I don't want to reinstall tf/cuda/cudnn on my machine (especially if have to do that everytime I try to run deep learning code with different tensorflow requirements everytime).
I'm looking for a way to install the requirements and run the code regardless of the host.
To my knowledge that is exactly what Docker is for.
Looking into this there exist docker images from nvidia. For example one called "nvidia/cuda:9.1-cudnn7-runtime". Based on the name I assumed that any image build with this as the base comes with cuda installed. This does not seem to be the case as if I try to install darknet it will fail with the error that "cuda_runtime.h" is missing.
So what my question basicaly boils down to is: How do I keep multiple different versions of cuda and tensorflow on the same machine ? Ideally with docker (or similar) so I won't have to do the process to many times.
It feels like I'm missing and/or don't understand something obvious, because I can't imagine that it can be so hard to run tensorflow code with different versions without reinstalling things from scratch all the time.
Related
I have an outdated neural network training python2.7 script which utilizes keras 2.1 on top of tensorflow 1.4; and I want it to be trained on my nVidia GPU; and I have CUDA SDK 10.2 installed on Linux. I thought Docker Hub is exactly for publishing frozen software packages which just work, but it seems there is no way to find a container with specific software set.
I know docker >=19.3 has native gpu support, and that nvidia-docker utility has cuda-agnostic layer; but the problem is i cannot install both keras-gpu and tensorflow-gpu of required versions, cannot find wheels, and this legacy script does not work with other versions.
Where did you get the idea tha Docker Hub hosts images with all possible library combinations?
If you cannot find an image that suits you, simply build your own.
I'm hoping this is an appropriate venue for my question. There's a lot of pieces to this puzzle.
I'm building a container using Docker that is destined to run on a Raspberry Pi Zero. The RPi Zero has an ARMv6 hard-float processor. The container will run a Python program that includes some dependencies that must be compiled (uses binary libraries). I am able to build and run the container on the RPi Zero itself, but building the container literally takes hours. I'm hoping to 1) speed up the process of building and 2) allow this to happen in a CI environment.
The approach I've taken in the past to build minimal Python containers that have dependencies requiring compilation is to use a multistage Docker build. I first startup a container with a full toolchain, then run pip wheel to compile all requirements into .whl files. I then copy the .whl files to the final container, install any binary libraries using the typical package manager, and then point pip install at this cache (--find-links=/wheels) for the installation of Python dependencies. This approach also works just fine on the Pi, but as I said it takes forever.
I've considered a few different approaches I could take:
Figure out how to get the Docker engine on my main dev machine (also in CI) to run and build an ARM image using qemu-arm-static while running docker build, and then somehow tag the resulting image as ARMv6 and upload it to my registry somehow. (I could just use a tag or a different repo name) I haven't honestly dug too deep into this, but my main concern is that every example I've seen of qemu-arm seems to indicate that it runs ARMv7 emulation. The RPi Zero can't actually even run many Docker containers that are made available for ARM due to this (exit 139). The arm32v6 "user" does provide working base images that run fine on the RPi Zero, which is what I'm using as the source images to build on my Pi itself.
Emulate an entire RasPi using qemu-system-arm. Again though, it looks like this emulates ARMv7, meaning the compiled wheels might not be able to run on the Pi zero.
Setup a cross-compiling toolchain for ARMv6. A few problems: I wouldn't know how to make sure pip uses that toolchain when compiling, and also I'd need to get and compile any other dependent library (even possibly all the way down to glibc?) so the header files will resolve.
It looks like this is easy to do if you want to do it for ARMv7 (which I believe the RPi 2 uses) or later, but I'm specifically using a Zero for my project, so I don't have that option.
TL;dr: How do I build binary Python wheels for ARMv6 using Docker without having to do it on a slow, single-core Raspberry Pi Zero?
I am the developer of a software product (NJOY) with build requirements of:
CMake 3.2
Python 3.4
gcc 6.2 or clang 3.9
gfortran 5.3+
In reading about Docker, it seems that I should be able to create an image with just these components so that I can compile my code and use it. Much of the documentation is written with the implication that one wants to create a scalable web architecture and thus, doesn’t appear to be applicable to compiled applications like what I’m trying to do. I know it is applicable, I just can’t seem to figure out what to do.
I’m struggling with separating the Docker concept from a Virtual Machine; I can only conceive of compiling my code in an environment that contains an entire OS instead of just the necessary components. I’ve begun a Docker image by starting with an Ubuntu image. This seems to work just fine, but I get the feeling that I’m overly complicating things.
I’ve seen a Docker image for gcc; I’d like to combine it with CMake and Python into an image that we can use. Is this even possible?
What is the right way to approach this?
Combining docker images is not available. Docker images are chained. You start from a base images and you then install additional tools that you want to add on top of the base image.
For instance, you can start from the gcc image and build on it by creating a Dockerfile. Your Dockerfile might look something like:
FROM gcc:latest
# install cmake
RUN apt-get install cmake
# Install python
RUN apt-get install python
Then you build this dockerfile to create the Docker image. This will give you an image that contains gcc, cmake and python.
I am looking for a way to set up or modify an existing Docker image for installing tensorflow that will install it such that the SSE4, AVX, AVX2, and FMA instructions can be utilized for CPU speed up. So far I have found how to install from source using bazel How to Compile Tensorflow... and CPU instructions not compiled.... Neither of these explain how to do this within Docker. So I think what I am looking for is what you need to add to an existing docker image that installs without these options so that you can get a compile version of tensorflow with the CPU options enabled. The existing docker images do not do this because they want the image to run on as many machines as possible. I am using Ubuntu 14.04 on linux PC. I am new to docker but have installed tensorflow and have it working without getting the CPU warnings I get when I use the docker images. I may not need this for speed, but I have seen posts that claim the speed up can be significant. I searched for existing docker images that do this and could not find anything. I need this to work with gpu so needs to be compatible with nvidia-docker.
I just found this docker support for bazel and it might provide an answer, however I do not understand it well enough to know for sure. I believe this is saying that you can not build tensorflow with bazel inside a Dockerfile. You have to build a Dockerfile using bazel. Is my understanding correct and is this the only way to get a docker image with tensorflow compiled from source? If so, I could still use help in how to do it and still get the other dependencies that I would get if using an existing docker image for tensorflow.
Dockerfiles that build with CPU support can be found here.
Hope that helps! Spent many a late night here on Stack Overflow and Github Issues and stuff. Now it's my turn to give back! :)
The GPU stuff in particular is really hairy - especially when enabling the XLA/JIT/AOT stuff as well as the Graph Transform Tools.
Lots of hacks embedded in my Dockerfiles. Feel free to review and ask me questions!
The contributing guidelines mention building TensorFlow from source with Docker to run the unit tests:
Refer to the
CPU-only developer Dockerfile and
GPU developer Dockerfile
for the required packages. Alternatively, use the said
Docker images, e.g.,
tensorflow/tensorflow:nightly-devel and tensorflow/tensorflow:nightly-devel-gpu
for development to avoid installing the packages directly on your system.
I have been developing in Tensorflow/Python on OSX. Trying to graduate to the big leagues. Bought a big new GPU PC, installed linux, CUDA, Docker, Tensrflow. (Pulled out lots of hair in the process). I thought that Docker-Tensorflow would provide a linux VM environment with Tensorflow, in which I could run my IDE and work from the cmd line like before, but it just seems to serve Jupyter notebooks. I've found some posts with what seem like heroic measures to develop in Docker. I suspect that Docker-Tensorflow is just meant for running demos, serving Jupyter notebooks, etc., and that for development I should revert to a conventional Tensorflow installation. Can someone please confirm (or deny) this? Thanks!
Yes, I share your opinion. Instead I use Anaconda (prior pyenv and virtualenv) to maintain environments and (local) dependencies. In detail tensorflow itself has just a few dependencies.