NVIDIA's developer blog (https://devblogs.nvidia.com/nvidia-docker-gpu-server-application-deployment-made-easy/)
states that nvidia-docker provides "driver-agnostic CUDA images".
I would like to clarify whether this applies only to the driver version, or also to the host OS.
For example:
Host = CentOS
Docker Image/Container = Ubuntu
Does nvidia-docker provide a way to use the CentOS host's NVIDIA driver inside the Ubuntu Docker container?
Currently I maintain two Dockerfiles, one for Ubuntu hosts and one for CentOS hosts, and manually mount /dev/nvidia0 and copy the library files (or install the driver) inside the Docker image.
I've already asked NVIDIA, but I'm still waiting for an answer.
I'll try it myself too, but I thought I'd try my luck in case anyone on SO already knows the answer.
Thank you in advance.
I've tested this and it does work.
"Driver-agnostic CUDA images" is not limited to different versions of the driver; it also holds across different host OSes (binary).
Thank you.
Related
First, I'm still a beginner with Docker.
I need to run multiple versions of TensorFlow, and each version requires a specific CUDA version.
My host operating system is Ubuntu 16.04.
I need multiple versions of CUDA on my OS because I'm working on multiple projects, each requiring a different CUDA version. I tried conda and virtual environments to solve that problem. After a while I gave up and started to search for alternatives.
Apparently virtual machines can't access the GPU; only if you own a specific GPU type can you run the official NVIDIA visualizer.
I have an NVIDIA 1080 GPU. I installed a fresh copy of Ubuntu 16.04 and started working on Dockerfiles to create custom images for my projects.
I was trying to avoid Docker to avoid complexity; after I failed to install and run multiple versions of CUDA, I turned to Docker. Apparently you can't access CUDA from Docker unless the CUDA driver is installed on the host machine.
I'm still not sure whether I can run Docker containers with a different CUDA version than the one installed on my PC.
If I can't, NVIDIA messed up big time. Usually, if there is no need for Docker, we avoid it to spare ourselves the additional complexity. When we need to work with multiple environments, and conda and virtual environments fail, we turn to Docker. So if NVIDIA limits Docker containers to one CUDA version, they only allow developers to work on one project with special dependencies per operating system.
Please confirm whether I can run containers that each have a specific CUDA version.
Moreover, I would greatly appreciate it if someone could point me to a guide on how to use conda environments in Dockerfiles and how to run a conda env in a Docker container.
Having several CUDA versions is possible with Docker. Moreover, none of them needs to be on your host machine; you can keep CUDA in a container, and IMO that's the best place for it.
To enable GPU support in a container and make use of CUDA in it, you need all of these installed:
Docker
(optional but recommended) docker-compose
NVIDIA Container Toolkit
NVIDIA GPU Driver
Once you have these, you can simply grab one of the official tensorflow images (if the built-in Python version fits your needs), install pip packages and start working in minutes. CUDA is included in the container image; you don't need it on the host machine.
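As a rough sketch of what "one CUDA version per project" can look like in practice, the snippet below builds one docker run command per project, each pointing at a different official image. The project names, the host paths and the 1.15.0-gpu tag are assumptions made for illustration; only the 2.3.0-gpu tag appears in this answer.

```python
import shlex

# Hypothetical project-to-image mapping. The 2.3.0-gpu tag is the one used in
# this answer; the 1.15.0-gpu tag is assumed here purely for illustration.
PROJECT_IMAGES = {
    "project-a": "tensorflow/tensorflow:2.3.0-gpu",
    "project-b": "tensorflow/tensorflow:1.15.0-gpu",
}


def docker_run_command(project, script="train.py"):
    """Build a `docker run` argv that runs `script` inside the project's image."""
    image = PROJECT_IMAGES[project]
    return [
        "docker", "run", "--rm",
        "--runtime=nvidia",                 # let the NVIDIA runtime expose the GPU
        "-v", f"/path/to/{project}:/work",  # hypothetical host path for the code
        "-w", "/work",
        image,
        "python", script,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for name in PROJECT_IMAGES:
        print(shlex.join(docker_run_command(name)))
```

Each image ships its own CUDA libraries, so the two projects never share a CUDA installation; only the GPU driver on the host is common to both.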
Here's an example docker-compose.yml to start a container with tensorflow-gpu. All the container does is test whether any GPU devices are available.
version: "2.3" # the only version where 'runtime' option is supported
services:
  test:
    image: tensorflow/tensorflow:2.3.0-gpu
    # Make Docker create the container with NVIDIA Container Toolkit
    # You don't need it if you set 'nvidia' as the default runtime in
    # daemon.json.
    runtime: nvidia
    # the lines below are here just to test that TF can see GPUs
    entrypoint:
      - /usr/local/bin/python
      - -c
    command:
      - "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"
Running this with docker-compose up, you should see a line with the GPU specs in it. It appears at the end and looks like this:
test_1 | 2021-01-23 11:02:46.500189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/device:GPU:0 with 1624 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
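If you would rather get an explicit answer than read the log, a small script along these lines could be run inside the container instead. It is only a sketch: it uses tf.config.list_physical_devices, which TensorFlow 2.x offers as the replacement for the deprecated tf.test.is_gpu_available, and the function name visible_gpus is made up for this example.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif gpus:
        print("GPUs visible to TensorFlow:", ", ".join(gpus))
    else:
        print("TensorFlow is installed but sees no GPU")
```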
When I run the following command
docker run mongo
it downloads the mongo image and runs it in a container.
I am running Linux on a VM.
My OS details are as follows:
NAME="CentOS Linux"
VERSION="7 (Core)"
If I were using a different OS (a Mac or a Windows machine), how would Docker determine which image to pull? As I understand it, there is a single image on Docker Hub for mongo; or can we specify a particular image to run based on our OS?
When installing on a local machine (without containers), we at least need to take care to download the right version of mongo.
How does Docker take care of this?
Thanks.
The OS that you are running is for the most part irrelevant when it comes to pulling a Docker image. As long as you are running Docker on your host (and the versions of Docker differ a little between Windows, Mac and Linux), you can pull any image you want. You can pull the same mongo image and run it on any operating system.
The image hides the host operating system, making it easy to build an image and deploy it on pretty much any machine.
Having said that, you may be getting confused because image makers often use different OSes to build their applications. A quick example is people building an application using an Ubuntu image but switching to an Alpine-based image for deployment because it is so much smaller. However, both images will run pretty much anywhere.
Perhaps you are confusing the terms OS and architecture?
The OS does not really matter because, as @camba1 mentioned, the Docker daemon handles all of that.
What matters is the architecture, because Linux can run on ARM, AMD64, etc.
So the Docker daemon must know which image is right for the current architecture.
Here is a good article regarding this question.
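To make the architecture point concrete, here is a minimal sketch that maps what the host reports as its machine type to the architecture names Docker uses for images. The mapping table is a small, non-exhaustive assumption, and docker_arch is a name invented for this example.

```python
import platform

# Common machine names mapped to Docker's architecture names.
# Deliberately small and not exhaustive.
DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm",
    "armv6l": "arm",
}


def docker_arch(machine=None):
    """Translate a machine name (default: this host's) to a Docker architecture."""
    machine = (machine or platform.machine()).lower()
    # Fall back to the raw name when the table has no entry for it.
    return DOCKER_ARCH.get(machine, machine)


if __name__ == "__main__":
    print("This host would need images built for:", docker_arch())
```

An image built only for amd64 will not start on a host where this reports arm or arm64, regardless of which Linux distribution either side uses.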
I am trying to get GPU support in my container without nvidia-docker.
I know that with nvidia-docker I just have to use --runtime=nvidia, but my current circumstances do not allow using nvidia-docker.
I tried installing the NVIDIA driver, CUDA and cuDNN in my container, but it fails.
How can I use tensorflow-gpu in my container without nvidia-docker?
You can use x11docker
Running a docker image on X with gpu is as simple as
x11docker --gpu imagename
You'll be happy to know that recent Docker versions (19.03 and later) come with support for NVIDIA GPUs. You use the --gpus flag (for example, --gpus all) to expose them to the container; the NVIDIA Container Toolkit still has to be installed on the host. See - How to use GPU a docker container
Earlier, you had to install nvidia-docker, which was plain Docker with a thin layer of abstraction for NVIDIA GPUs. See - Nvidia Docker
You cannot simply install NVIDIA drivers in a Docker container. The container must have access to the hardware. I'm not certain, but mounts might help you with that issue. See - https://docs.docker.com/storage/
You can use anaconda to install and use Tensorflow-gpu.
Make sure you have the latest nvidia drivers installed.
Install Anaconda 2 or 3 from the official site.
https://www.anaconda.com/distribution/
Create a new environment and install tensorflow-gpu and cudatoolkit.
$ conda create -n tf-gpu tensorflow-gpu python cudnn cudatoolkit
You can also pin the versions of the packages.
E.g. $ conda create -n tf-gpu tensorflow-gpu python=3.5 cudnn cudatoolkit=8
Please check that your hardware has the minimum compute capability required by the version of CUDA that you are or will be using.
If you can't pass --runtime=nvidia as a command-line option (e.g. with docker-compose), you can set the default runtime in the Docker daemon config file /etc/docker/daemon.json:
{
    "default-runtime": "nvidia"
}
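An existing daemon.json often already holds other settings, so it is safer to merge the key in than to overwrite the file. The sketch below shows one way to do that on the file's text; with_default_runtime is a hypothetical helper, and the "runtimes" entry in the usage example is an assumed sample of what an NVIDIA install may have written, not something taken from this answer.

```python
import json


def with_default_runtime(config_text, runtime="nvidia"):
    """Return daemon.json text with "default-runtime" set and other keys kept."""
    # An empty or missing file is treated as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    config["default-runtime"] = runtime
    return json.dumps(config, indent=4) + "\n"


if __name__ == "__main__":
    # Assumed sample of an existing config; real files may differ.
    existing = """
    {
        "runtimes": {
            "nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}
        }
    }
    """
    print(with_default_runtime(existing))
```

The Docker daemon reads this file at startup, so it has to be restarted before the new default takes effect.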
js server in docker. On my Windows 10 64-bit machine it works fine. But when I try to run the image on my Raspberry Pi, I get: standard_init_linux.go:190: exec user process caused "exec format error".
mariu5 in the Docker forum has a workaround, but I do not know what to do with it.
https://forums.docker.com/t/standard-init-linux-go-190-exec-user-process-caused-exec-format-error/49368/4
Where can I update the deployment.template.json file, and does the Raspberry Pi 3 Model B+ have an arm32 architecture?
You need to rebuild your image on the raspberry pi or find one that is compatible with it.
Perhaps this example might help:
https://github.com/hypriot/rpi-node-example-hello-world
The link you posted is not a workaround but rather a "one can't do that".
You have to run the docker image that was built for a particular
Architecture on a docker node that is running that same Architecture.
Take docker out of the picture for a moment. You can’t run a Linux
application compiled and built on a Linux ARM machine and run it on a
Linux amd64 machine. You can’t run a Linux application compiled and
built on a Linux Power machine on a Linux amd64 machine.
https://forums.docker.com/t/standard-init-linux-go-190-exec-user-process-caused-exec-format-error/49368/4
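The "exec format error" is what the kernel reports when it is asked to execute a binary it cannot run, for example one built for another CPU architecture. As an illustration, this sketch reads the header of a Linux (ELF) binary and reports the architecture it was built for. The function names are made up for this example, and the machine table lists only a few values from the ELF specification.

```python
import struct

# A few e_machine values from the ELF specification (not exhaustive).
ELF_MACHINES = {3: "x86", 40: "arm", 62: "amd64", 183: "arm64"}


def elf_machine(header):
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def binary_arch(path):
    """Read just enough of the file at `path` to identify its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Running binary_arch on the entrypoint binary extracted from an image and comparing the result with the host's architecture shows the mismatch directly: an image built on a 64-bit Windows or Linux PC typically contains amd64 binaries, which a Raspberry Pi cannot execute.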
Does adding :cached to a volume mount for Mac performance tuning affect Docker for Windows volume mounts?
I'm working on a team with both Mac and Windows machines, and it still seems to work, but I'd like to know if anyone has more to add on this.
Here is the Docker docs link: https://docs.docker.com/docker-for-mac/osxfs-caching/
But they don't say anything about how it affects Windows.
An example they give in their docs:
docker run -v /Users/yallop/project:/project:cached alpine command
Cheers.
It may only have an effect on Mac.
:delegated and :cached flags are redundant since Docker Desktop 2.4.0.0 where gRPC FUSE file sharing is used by default.
Source: https://github.com/docker/for-mac/issues/5402