I am trying to containerize my ROS + tensorflow application. The problem is that I want to use the GPU: I can either use the GPU and forget about ROS, or use ROS and forget about the GPU, but I don't know how to enable both of them.
I have tried starting FROM a cuda image and installing ROS as described here, but ROS could not be installed; apt was unable to find the packages.
I also tried building them one by one in the same Dockerfile and copying all the ROS-related stuff from a ROS build, but that failed too.
Ideally, I want to make it work as if I could "include" a Cuda image and a ROS image, but resources on building from multiple images like this are hard to find. Any help or pointers would be appreciated.
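For concreteness, the kind of combined Dockerfile I'm after would look something like this sketch (I'm guessing Ubuntu 18.04 with ROS Melodic; the CUDA tag, key, and package names would need checking against the ROS install docs):

FROM nvidia/cuda:10.2-devel-ubuntu18.04
ENV DEBIAN_FRONTEND=noninteractive
# Add the ROS apt repository so apt can find the ros-* packages
RUN apt-get update && apt-get install -y gnupg2 lsb-release dirmngr && \
    echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list && \
    apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654
# ROS Melodic is the distribution matching Ubuntu 18.04
RUN apt-get update && apt-get install -y ros-melodic-ros-base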
I am using this spark image from datamechanics, and am assuming that the image has hadoop installed because the name says so. But I can't find it in the usual locations (/usr/local, /opt, etc.). Also, the docs are not easy to read to understand how the image was built (mostly because I couldn't find a Dockerfile-like file which I can read). Does anyone know if hadoop is actually installed in the datamechanics images? If not, is there an alternative image that is recommended for spark and hadoop? Thanks!
The name of the image is a reference to the Spark downloadable package that includes Spark 3.1.1 with Hadoop 3.2.0 client libraries, as compared to the "bring your own" version...
It does not "come with" scripts to run HDFS or YARN, and therefore has neither "installed". You can search Docker Hub for datanode/namenode images, but neither is necessary to run Spark from the image you've found.
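If you want to verify this yourself, the Hadoop client jars should sit next to the Spark jars inside the image. A hedged one-liner; the tag is a placeholder for whatever you're pulling, and $SPARK_HOME is assumed to be set in the image:

docker run --rm --entrypoint sh datamechanics/spark:<your-tag> -c 'ls "$SPARK_HOME/jars" | grep -i hadoop'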
I'm trying to run the code from this repository: https://github.com/danielgordon10/thor-iqa-cvpr-2018
It has the following requirements:
Python 3.5
CUDA 8 or 9
cuDNN
Tensorflow 1.4 or 1.5
Ubuntu 16.04, 18.04
an installation of darknet
My system satisfies none of these. I don't want to reinstall tf/cuda/cudnn on my machine (especially if I have to do that every time I try to run deep learning code with different tensorflow requirements).
I'm looking for a way to install the requirements and run the code regardless of the host.
To my knowledge that is exactly what Docker is for.
Looking into this, I found that there exist docker images from nvidia, for example one called "nvidia/cuda:9.1-cudnn7-runtime". Based on the name I assumed that any image built with this as the base comes with cuda installed. This does not seem to be the case: if I try to install darknet, it fails with the error that "cuda_runtime.h" is missing.
So what my question basically boils down to is: how do I keep multiple different versions of cuda and tensorflow on the same machine? Ideally with docker (or similar) so I won't have to repeat the process too many times.
It feels like I'm missing and/or don't understand something obvious, because I can't imagine that it can be so hard to run tensorflow code with different versions without reinstalling things from scratch all the time.
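For concreteness, here is the kind of image I'm hoping can be built. I'm assuming the -devel CUDA variant (which, unlike -runtime, includes cuda_runtime.h and nvcc) is the fix for the darknet error, and the tags are guesses I'd still verify on Docker Hub:

FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04
# Ubuntu 16.04 ships Python 3.5, matching the repo's requirement
RUN apt-get update && apt-get install -y python3 python3-pip git
# TensorFlow 1.5 targets CUDA 9 / cuDNN 7; 1.4 would need a CUDA 8 base instead
RUN pip3 install tensorflow-gpu==1.5.0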
I have an outdated neural network training python2.7 script which utilizes keras 2.1 on top of tensorflow 1.4, and I want it to be trained on my NVIDIA GPU; I have CUDA SDK 10.2 installed on Linux. I thought Docker Hub was exactly for publishing frozen software packages which just work, but it seems there is no way to find a container with a specific software set.
I know docker >= 19.03 has native GPU support, and that the nvidia-docker utility has a cuda-agnostic layer; but the problem is I cannot install both keras-gpu and tensorflow-gpu at the required versions, cannot find wheels, and this legacy script does not work with other versions.
Where did you get the idea that Docker Hub hosts images with all possible library combinations?
If you cannot find an image that suits you, simply build your own.
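For example, a minimal sketch, assuming the historical Python 2.7-based TF 1.4 tag is still published on Docker Hub and that keras 2.1.6 is an acceptable 2.1.x pin:

FROM tensorflow/tensorflow:1.4.0-gpu
# Pin keras to the 2.1 line on top of the bundled tensorflow 1.4
RUN pip install keras==2.1.6

The image bundles its own CUDA 8 runtime, so the CUDA 10.2 SDK on your host doesn't conflict; the host only needs a recent enough NVIDIA driver, exposed via nvidia-docker or docker run --gpus all.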
I am the developer of a software product (NJOY) with build requirements of:
CMake 3.2
Python 3.4
gcc 6.2 or clang 3.9
gfortran 5.3+
In reading about Docker, it seems that I should be able to create an image with just these components so that I can compile my code and use it. Much of the documentation is written with the implication that one wants to create a scalable web architecture, and thus doesn’t appear to be applicable to compiled applications like what I’m trying to build. I know it is applicable; I just can’t seem to figure out what to do.
I’m struggling to separate the Docker concept from a virtual machine; I can only conceive of compiling my code in an environment that contains an entire OS rather than just the necessary components. I’ve begun a Docker image by starting with an Ubuntu image. This seems to work just fine, but I get the feeling that I’m overcomplicating things.
I’ve seen a Docker image for gcc; I’d like to combine it with CMake and Python into an image that we can use. Is this even possible?
What is the right way to approach this?
Combining Docker images is not possible. Docker images are layered: you start from a base image and then install the additional tools you want on top of it.
For instance, you can start from the gcc image and build on it by creating a Dockerfile. Your Dockerfile might look something like:
FROM gcc:latest
# Install CMake (run apt-get update first so the package lists exist)
RUN apt-get update && apt-get install -y cmake
# Install Python
RUN apt-get install -y python3
Then you build this Dockerfile to create the Docker image. This will give you an image that contains gcc, cmake, and python.
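Assuming the Dockerfile above is saved as Dockerfile in the current directory, the build and an interactive test run look like this (the image name is arbitrary):

docker build -t njoy-build .
docker run --rm -it -v "$PWD":/src -w /src njoy-build bash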
I am looking for a way to set up or modify an existing Docker image for installing tensorflow so that it is built with the SSE4, AVX, AVX2, and FMA instructions enabled for CPU speed-up. So far I have found how to install from source using bazel (How to Compile Tensorflow... and CPU instructions not compiled...), but neither of these explains how to do it within Docker. I think what I am looking for is what you need to add to an existing docker image (one that installs without these options) so that you get a version of tensorflow compiled with the CPU options enabled. The existing docker images do not do this because they want the image to run on as many machines as possible.

I am using Ubuntu 14.04 on a Linux PC. I am new to docker, but I have installed tensorflow natively and have it working without the CPU warnings I get when I use the docker images. I may not need this for speed, but I have seen posts that claim the speed-up can be significant. I searched for existing docker images that do this and could not find anything. I need this to work with GPU, so it needs to be compatible with nvidia-docker.
I just found this docker support for bazel, and it might provide an answer; however, I do not understand it well enough to know for sure. I believe it is saying that you cannot build tensorflow with bazel inside a Dockerfile; you have to build a Dockerfile using bazel. Is my understanding correct, and is this the only way to get a docker image with tensorflow compiled from source? If so, I could still use help on how to do it while keeping the other dependencies that I would get from an existing docker image for tensorflow.
Dockerfiles that build with CPU support can be found here.
Hope that helps! Spent many a late night here on Stack Overflow and Github Issues and stuff. Now it's my turn to give back! :)
The GPU stuff in particular is really hairy, especially when enabling the XLA/JIT/AOT stuff as well as the Graph Transform Tools.
Lots of hacks embedded in my Dockerfiles. Feel free to review and ask me questions!
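For orientation, the core step those Dockerfiles perform boils down to something like this sketch; the exact tag, configure defaults, and flag set are things to adapt rather than copy:

FROM tensorflow/tensorflow:1.5.0-devel-gpu
# The -devel images ship bazel plus a TensorFlow source tree under /tensorflow
WORKDIR /tensorflow
# Accept the configure defaults, then rebuild the pip package with the CPU
# instruction sets (SSE4.x/AVX/AVX2/FMA) that the stock wheels leave out
RUN yes "" | ./configure && \
    bazel build --config=opt --config=cuda \
      --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma \
      //tensorflow/tools/pip_package:build_pip_package && \
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip && \
    pip install --upgrade /tmp/pip/tensorflow-*.whl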
The contributing guidelines mention building TensorFlow from source with Docker to run the unit tests:
Refer to the CPU-only developer Dockerfile and GPU developer Dockerfile for the required packages. Alternatively, use the said Docker images, e.g., tensorflow/tensorflow:nightly-devel and tensorflow/tensorflow:nightly-devel-gpu, for development to avoid installing the packages directly on your system.
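A hedged usage sketch with the GPU image (the mount path is arbitrary; --gpus needs Docker 19.03+ with the NVIDIA container toolkit, while older setups would use nvidia-docker instead):

docker pull tensorflow/tensorflow:nightly-devel-gpu
docker run --gpus all -it -v "$PWD":/mnt/tensorflow -w /mnt/tensorflow tensorflow/tensorflow:nightly-devel-gpu bash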