What are the differences between the official PyTorch image on Docker Hub and the PyTorch image on NVIDIA NGC?
The NGC page is better documented than the Docker Hub page, which has no description at all. But the NGC image is also a few gigabytes heavier, and it seems to require a CUDA 10.2-compatible driver.
Is there any advantage in using the NVIDIA NGC image instead of the one from Docker Hub?
I'm not a Docker expert.
To the best of my knowledge, NVIDIA puts a lot of effort into shipping GPU-optimized containers, so running GPU PyTorch on an NVIDIA GPU with an NVIDIA container should give the best possible performance.
Therefore, if you are using NVIDIA hardware, you should expect better performance using the NGC containers. The gap, though, might not be that significant.
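If you want to compare the two yourself, both are a single pull away, so you can benchmark your own workload in each. The tags below are only examples and may be outdated (Docker Hub tags track PyTorch releases; NGC tags are year.month snapshots):

    # Docker Hub:
    docker pull pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime
    # NGC:
    docker pull nvcr.io/nvidia/pytorch:20.03-py3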
Firstly, I'm still a beginner with Docker.
I need to run multiple versions of TensorFlow, and each version requires a specific CUDA version.
My host operating system is Ubuntu 16.04.
I need to have multiple versions of CUDA on my OS since I'm working on multiple projects, each of which requires a different CUDA version. I tried to use conda and virtual environments to solve that problem. After a while I gave up and started to search for alternatives.
Apparently virtual machines can't access the GPU; only if you own a specific GPU type can you run NVIDIA's official virtualization.
I have an NVIDIA 1080 GPU. I installed a fresh image of Ubuntu 16.04 and started working on Dockerfiles to create custom images for my projects.
I was trying to avoid Docker to keep complexity down; only after I failed at installing and running multiple versions of CUDA did I turn to it. Apparently you can't access CUDA from Docker directly if you don't install the GPU driver on the host machine.
I'm still not sure whether I can run Docker containers with a different CUDA version than the one installed on my PC.
If that is the case, NVIDIA messed up big time. Usually, if there is no need to use Docker, we avoid it to escape additional complexity. When we need to work with multiple environments, and conda and virtual environments fail, we head towards Docker. So if NVIDIA limits a Docker container to one CUDA version, they effectively only allow developers to work on one project with special dependencies per operating system.
Please confirm whether I can run containers that each have a specific CUDA version.
Moreover, I would greatly appreciate it if someone could point me to a guide on how to use conda environments in Dockerfiles and how to run a conda env in a Docker container.
Having several CUDA versions is possible with Docker. Moreover, none of them needs to be on your host machine; you can have CUDA in a container, and that is, IMO, the best place for it.
To enable GPU support in a container and make use of CUDA in it, you need to have all of these installed (an install sketch for Ubuntu follows the list):
Docker
(optionally but recommended) docker-compose
NVIDIA Container Toolkit
NVIDIA GPU Driver
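For reference, here is a rough install sketch for an Ubuntu host. It mirrors NVIDIA's installation docs from around that time; the repository URL and package names are assumptions that may have changed since:

    # Add NVIDIA's package repository and signing key
    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
        sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    # Install the toolkit and restart the Docker daemon
    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
    sudo systemctl restart docker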
Once you've obtained these, you can simply grab one of the official tensorflow images (if the built-in Python version fits your needs), install your pip packages, and start working in minutes. CUDA is included in the container image; you don't need it on the host machine.
Here's an example docker-compose.yml that starts a container with tensorflow-gpu. All the container does is test whether any GPU devices are available.
version: "2.3" # the only version where 'runtime' option is supported
services:
test:
image: tensorflow/tensorflow:2.3.0-gpu
# Make Docker create the container with NVIDIA Container Toolkit
# You don't need it if you set 'nvidia' as the default runtime in
# daemon.json.
runtime: nvidia
# the lines below are here just to test that TF can see GPUs
entrypoint:
- /usr/local/bin/python
- -c
command:
- "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"
Running this with docker-compose up, you should see a line with the GPU specs in it. It appears near the end of the output and looks like this:
test_1 | 2021-01-23 11:02:46.500189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/device:GPU:0 with 1624 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
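If you prefer a one-off docker run over Compose, a roughly equivalent command (assuming Docker 19.03+, where the --gpus flag can stand in for the 'runtime: nvidia' option) would be:

    docker run --rm --gpus all tensorflow/tensorflow:2.3.0-gpu \
        python -c "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"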
What are the differences between the official Tensorflow image on Docker Hub and the Tensorflow image on NVIDIA NGC?
I just want to train my model with both TF 1 and TF 2. The NGC images are huge, yet both variants work well and I don't see any difference between them. I'm wondering why there are two duplicate repositories.
This may seem like a silly question, but I am a beginner with the containerization concept, and I was wondering why the Ubuntu image from Docker Hub (~80 MB) is so much smaller than its ISO file (~1.8 GB).
A container is an isolated space within your host's kernel.
The ubuntu:18.04 Docker image does not contain kernel binaries at all.
It only has the libraries, executables, and configuration required to run ubuntu:18.04; it still uses your host's kernel.
You can take a look at how the ubuntu:18.04 image is created from its Dockerfile here.
I recommend you look into how Docker uses cgroups and namespaces to create containers.
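A quick way to convince yourself of the shared-kernel point: print the kernel version from inside a throwaway container and compare it with the host's. Both commands should report the same version, because the container has no kernel of its own:

    # Inside a container:
    docker run --rm ubuntu:18.04 uname -r
    # On the host:
    uname -r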
Docker images contain only the minimum libraries and tools needed for the operating system to run. The Ubuntu Docker image has no GUI (which is rarely, if ever, used in a container) and leaves out most tools; it is just a base operating system. Alpine images are much smaller still than the Ubuntu ones.
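To see the difference yourself, pull both base images and compare their sizes locally (the exact numbers vary by tag and date):

    docker pull ubuntu:18.04
    docker pull alpine:3.12
    docker images ubuntu
    docker images alpine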
When I run the following command
docker run mongo
It will download the mongo image and run it on container.
I am running Linux on VM.
My OS details are as follows:
NAME="CentOS Linux"
VERSION="7 (Core)"
In case I am using a different OS / a Mac machine / Windows, how does Docker determine which image to pull? As I understand it, there is a single image on Docker Hub for mongo; or can we specify a specific image to run based on our OS?
At the very least, we need to take care of downloading a specific version of mongo when installing on a local machine (when not using containers).
How is this taken care of by Docker?
Thanks.
The OS that you are running is, for the most part, irrelevant when it comes to pulling a Docker image. As long as you are running Docker on your host (and the versions of Docker differ a little between Windows, Mac, and Linux), you can pull any image you want. You can pull the same mongo image and run it in any operating system.
The image hides the host operating system, making it easy to build an image and deploy it on pretty much any machine.
Having said that, you may be getting confused because image makers often use a different OS to build their applications than to ship them. A quick example is building an application in an Ubuntu image but switching to an Alpine-based image for deployment, because that is so much smaller. However, both images will run pretty much anywhere.
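This build-big, ship-small pattern is usually expressed as a multi-stage Dockerfile. A minimal sketch, assuming a hypothetical statically linked C program hello.c:

    # Stage 1: build in a full Ubuntu image
    FROM ubuntu:18.04 AS build
    RUN apt-get update && apt-get install -y gcc
    COPY hello.c .
    RUN gcc -static -o hello hello.c

    # Stage 2: ship only the binary in a tiny Alpine image
    FROM alpine:3.12
    COPY --from=build /hello /hello
    ENTRYPOINT ["/hello"]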
Probably you are confused by the terms OS and architecture?
The OS does not really matter because, as @camba1 mentioned, the Docker daemon handles all that.
What matters is the architecture, because Linux can run on ARM, AMD64, etc.
So the Docker daemon must know which image is right for the current architecture.
Here is a good article regarding this question.
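You can list the per-architecture variants that sit behind the single mongo name yourself (docker manifest inspect may require enabling the Docker CLI's experimental features, depending on your version):

    docker manifest inspect mongo | grep architecture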
I'm getting started with my Jetson Nano and I'm looking for an example that I can launch by running docker run xxx where xxx is some image at DockerHub that uses the GPU.
I assume I'll have to pass in some --device flags, but is there even any kind of "hello world"-style sample ready to go that uses the GPU from docker?
I'm hoping to just demonstrate that you can access the GPU from a docker container on the Jetson Nano. Mostly to make sure that my configuration is correct.
NVIDIA JetPack 4.2.1 made it easy to run Docker with GPU support on the Jetson Nano.
See here for detailed instruction on how to get Docker and Kubernetes running on Jetson Nano with GPU:
https://medium.com/jit-team/building-a-gpu-enabled-kubernets-cluster-for-machine-learning-with-nvidia-jetson-nano-7b67de74172a
It uses a simple Docker Hub hosted image for TensorFlow.
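As a minimal sketch of what that article sets up: on JetPack 4.2.1+ the nvidia container runtime is available out of the box, and NVIDIA's l4t-base images mount CUDA from the host. The tag below is an example and must match your L4T release:

    docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.2
    # Inside the container, the CUDA toolkit mounted from the host
    # should be visible, e.g.:
    #   ls /usr/local/cuda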
You're not alone in wanting that, but you cannot do it at the moment. The NVIDIA Nano team are aware of the need, and the feature is expected later this year.
See https://devtalk.nvidia.com/default/topic/1050809/jetson-nano/docker-image-to-see-if-cuda-is-working-in-container-on-jetson-nano-/
At present you can run a Docker container with TensorFlow or PyTorch installed, but it will only use the CPU, not the GPU.