I'm getting started with my Jetson Nano and I'm looking for an example that I can launch by running docker run xxx, where xxx is some image on Docker Hub that uses the GPU.
I assume I'll have to pass in some --device flags, but is there even any kind of "hello world"-style sample ready to go that uses the GPU from docker?
I'm hoping to just demonstrate that you can access the GPU from a docker container on the Jetson Nano. Mostly to make sure that my configuration is correct.
NVIDIA JetPack 4.2.1 enables easy Docker with GPU support on the Jetson Nano.
See here for detailed instructions on how to get Docker and Kubernetes running on the Jetson Nano with GPU support:
https://medium.com/jit-team/building-a-gpu-enabled-kubernets-cluster-for-machine-learning-with-nvidia-jetson-nano-7b67de74172a
It uses a simple Docker Hub hosted image for TensorFlow.
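As a minimal "hello world"-style check (a sketch, assuming JetPack 4.2.1 or later with its bundled nvidia container runtime; the r32.2 tag is an example and should match your L4T release), you can run NVIDIA's L4T base image and look for the CUDA installation that the runtime mounts in:

```shell
# Run NVIDIA's L4T base image with the nvidia runtime bundled by JetPack.
# Pick the image tag that matches your L4T/JetPack release.
docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.2 \
    ls /usr/local/cuda
```

If the runtime is wired up correctly, the CUDA directory from the host appears inside the container.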
You're not alone in wanting that, but you cannot do it at the moment. The NVIDIA Nano team is aware of the need, and the feature is expected later this year.
See https://devtalk.nvidia.com/default/topic/1050809/jetson-nano/docker-image-to-see-if-cuda-is-working-in-container-on-jetson-nano-/
At present you can run a Docker container with TensorFlow or PyTorch installed, but it will only use the CPU, not the GPU.
Firstly, I'm still a beginner in Docker.
I need to run multiple versions of TensorFlow, and each version requires a specific CUDA version.
My host operating system is Ubuntu 16.04.
I need to have multiple versions of CUDA on my OS since I'm working on multiple projects, each requiring a different CUDA version. I tried to use conda and virtual environments to solve that problem; after a while I gave up and started to search for alternatives.
Apparently virtual machines can't access the GPU; only if you own a specific GPU type can you run NVIDIA's official virtualization.
I have an NVIDIA 1080 GPU. I installed a fresh image of Ubuntu 16.04 and started working on Dockerfiles to create custom images for my projects.
I was trying to avoid Docker to keep complexity down, but after I failed at installing and running multiple versions of CUDA I turned to it. Apparently you can't access CUDA from Docker unless you install the NVIDIA driver on the host machine.
I'm still not sure whether I can run Docker containers with a different CUDA version than the one installed on my PC.
If that is the case, NVIDIA messed up big time. Usually, if there is no need to use Docker, we avoid it to escape additional complexity; when we need to work with multiple environments, and conda and virtual environments fail, we head towards Docker. So if NVIDIA limits a Docker container to one CUDA version, they only intended to allow developers to work on one project with special dependencies per operating system.
Please confirm whether I can run containers that each have a specific CUDA version.
Moreover, I would greatly appreciate it if someone could point me to a guide on how to use conda environments in Dockerfiles and how to run a conda env in a Docker container.
Having several CUDA versions is possible with Docker. Moreover, none of them needs to be on your host machine; you can have CUDA in a container, and that is, in my opinion, the best place for it.
To enable GPU support in a container and make use of CUDA in it, you need to have all of these installed:
Docker
(optional but recommended) docker-compose
NVIDIA Container Toolkit
NVIDIA GPU Driver
Once you've obtained these, you can simply grab one of the official tensorflow images (if the built-in Python version fits your needs), install pip packages, and start working in minutes. CUDA is included in the container image; you don't need it on the host machine.
Here's an example docker-compose.yml to start a container with tensorflow-gpu. All the container does is test whether any GPU devices are available.
version: "2.3"  # the only version where the 'runtime' option is supported
services:
  test:
    image: tensorflow/tensorflow:2.3.0-gpu
    # Make Docker create the container with the NVIDIA Container Toolkit.
    # You don't need it if you set 'nvidia' as the default runtime in
    # daemon.json.
    runtime: nvidia
    # the lines below are here just to test that TF can see GPUs
    entrypoint:
      - /usr/local/bin/python
      - -c
    command:
      - "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"
Running this with docker-compose up, you should see a line with the GPU specs in it. It appears near the end and looks like this:
test_1 | 2021-01-23 11:02:46.500189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/device:GPU:0 with 1624 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
What are the differences between the official PyTorch image on Docker Hub and the PyTorch image on NVIDIA NGC?
The NGC page is better documented than the Docker Hub page, which has no description at all. But the NGC image is also a few gigabytes heavier, and it seems to require a CUDA 10.2-compatible driver.
Is there any advantage in using the NVIDIA NGC image instead of the one from Docker Hub?
I'm not a docker expert.
To the best of my knowledge, NVIDIA puts a lot of effort into shipping GPU-optimized containers, such that running GPU PyTorch on an NVIDIA GPU with an NVIDIA container should give the best possible performance.
Therefore, if you are using NVIDIA hardware, you should expect better performance using NGC containers. The gap, though, might not be that significant.
I am trying to run Docker in Docker on a GCE instance, but to no avail.
I have used a similar setup on a regular host and it worked fine.
However, when I try it with --create-with-container I get:
Segmentation fault (core dumped)
Segmentation fault (core dumped)
I have docker installed in the image. It works fine when the image is run on a normal host.
Here is how I am trying to do it:
gcloud compute instances \
create-with-container docker-in-docker-on-cge \
--container-restart-policy=never \
--container-privileged \
--container-mount-host-path=host-path=/var/run/docker.sock,mount-path=/var/run/docker.sock \
--container-mount-host-path=host-path=/usr/bin/docker,mount-path=/usr/bin/docker \
--container-image=$MYIMAGE
Do you think this is possible at all, and if yes, what should I do?
Thanks
When you use the command gcloud compute instances create-with-container to create a GCE VM instance running a container image, Container-Optimized OS (COS) is deployed. This is an operating system optimized for running Docker containers, but it lacks many of the components found in a typical Linux distribution.
It has a number of limitations: for instance, the COS kernel is locked down, so you are unable to install third-party kernel modules or drivers. Containerized applications that depend on kernel modules, drivers, and other additional packages that are not available in COS might not work. It is a locked-down environment with a small attack surface that runs your containers as safely as possible.
For more details please see Container-Optimized OS Overview
It's unlikely that a Docker-in-Docker configuration on COS is supported by Google.
Apart from that, there are good explanations of why running a nested Docker configuration is troublesome and what a workaround could be:
Is it ok to run docker from inside docker?
Docker in Docker?
Both of the above are based on the original article by the author of the Docker-in-Docker feature, Jérôme Petazzoni: Using Docker-in-Docker for your CI or testing environment? Think twice.
I am trying to get GPU support in my container without nvidia-docker. I know that with nvidia-docker I just have to use --runtime=nvidia, but my current circumstances do not allow using nvidia-docker.
I tried installing the NVIDIA driver, CUDA, and cuDNN in my container, but it fails.
How can I use TensorFlow with the GPU in my container without nvidia-docker?
You can use x11docker
Running a Docker image on X with GPU support is as simple as
x11docker --gpu imagename
You'll be happy to know that recent Docker versions (19.03+) come with native support for NVIDIA GPUs. You'll need to use the --gpus flag to expose your NVIDIA devices. See - How to use GPU a docker container
Earlier, you had to install nvidia-docker, which was plain Docker with a thin layer of abstraction for NVIDIA GPUs. See - Nvidia Docker
You cannot simply install NVIDIA drivers in a Docker container; the container must have access to the hardware. Though I'm not certain, mounts might help you with that issue. See - https://docs.docker.com/storage/
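For completeness, with Docker 19.03 or newer plus the NVIDIA Container Toolkit on the host, a minimal smoke test looks like this (a sketch; the CUDA image tag is an example and must be compatible with your host driver):

```shell
# Request all GPUs via Docker's native --gpus flag (19.03+, requires the
# NVIDIA Container Toolkit on the host) and print the driver/GPU table.
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
```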
You can use anaconda to install and use Tensorflow-gpu.
Make sure you have the latest nvidia drivers installed.
Install Anaconda 2 or 3 from the official site.
https://www.anaconda.com/distribution/
Create a new environment and install tensorflow-gpu and cudatoolkit.
$conda create -n tf-gpu tensorflow-gpu python cudnn cudatoolkit
You can also specify the version of each package.
E.g. $conda create -n tf-gpu tensorflow-gpu python=3.5 cudnn cudatoolkit=8
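Once the environment from the command above is created (a sketch; tf-gpu is the env name used above), you can check that TensorFlow actually sees the GPU:

```shell
# Activate the env created above and ask TensorFlow whether a GPU is visible.
conda activate tf-gpu
python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
```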
Please do check if your hardware has the minimum compute capability to support the version of CUDA that you are/will be using.
If you can't pass --runtime=nvidia as a command-line option (e.g. with docker-compose), you can set the default runtime in the Docker daemon config file /etc/docker/daemon.json:
{
"default-runtime": "nvidia"
}
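After editing daemon.json you need to restart the Docker daemon for the new default runtime to take effect (assuming a systemd-based host):

```shell
# Restart Docker so the default-runtime setting is picked up.
sudo systemctl restart docker
```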
js server in Docker. On my Windows 10 64-bit machine it works fine. But when I try to run the image on my Raspberry Pi I get: standard_init_linux.go:190: exec user process caused "exec format error".
mariu5 in the Docker forum has a workaround, but I do not know what to do with it.
https://forums.docker.com/t/standard-init-linux-go-190-exec-user-process-caused-exec-format-error/49368/4
Where can I update the deployment.template.json file, and does the Raspberry Pi 3 Model B+ have an arm32 architecture?
You need to rebuild your image on the Raspberry Pi or find one that is compatible with it.
Perhaps this example might help:
https://github.com/hypriot/rpi-node-example-hello-world
The link you posted is not a workaround but rather a "you can't do that".
You have to run the docker image that was built for a particular architecture on a docker node that is running that same architecture.
Take docker out of the picture for a moment. You can't run a Linux application compiled and built on a Linux ARM machine on a Linux amd64 machine. You can't run a Linux application compiled and built on a Linux Power machine on a Linux amd64 machine.
https://forums.docker.com/t/standard-init-linux-go-190-exec-user-process-caused-exec-format-error/49368/4
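If rebuilding on the Pi itself is slow or inconvenient, one workaround is to cross-build the image on your amd64 machine (a sketch, assuming Docker with the buildx plugin and QEMU binfmt handlers; myimage is a placeholder name). The Raspberry Pi 3 Model B+ runs a 32-bit Raspbian userland by default, i.e. the linux/arm/v7 platform:

```shell
# One-time: register QEMU emulators so an amd64 Docker can build ARM images.
docker run --privileged --rm tonistiigi/binfmt --install arm
# Cross-build for the Pi's 32-bit ARM userland and load it locally
# (use --push instead of --load to publish to a registry).
docker buildx build --platform linux/arm/v7 -t myimage:armv7 --load .
```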