Run nvidia-docker on Jetson Nano and Jetson Xavier for deep learning frameworks like TensorFlow - nvidia

I am currently trying to run nvidia-docker on Jetson Xavier and Jetson Nano with the TensorFlow framework enabled inside, but the problem I'm facing right now is related to "libcublas.so".
I tried the solution mentioned here:
https://devtalk.nvidia.com/default/topic/1043951/jetson-agx-xavier/docker-gpu-acceleration-on-jetson-agx-for-ubuntu-18-04-image/post/5296647/#5296647
All package installations (pip installs and apt-get installs) completed successfully, but when I try to import TensorFlow in either Python 2.7 or 3.6, I get the following error:
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Has anyone hit this error using Jetson Xavier or Jetson Nano?

sudo docker run --net=host --rm --runtime nvidia --ipc=host -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/argus_socket:/tmp/argus_socket --cap-add SYS_PTRACE -e DISPLAY=$DISPLAY -it [container]
I'm assuming TensorFlow already works on the host first.
Source: the official forum, still up to date.
(Individual --device flags are no longer needed; the nvidia runtime exposes the devices.)
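As a quick sanity check inside the container (a minimal sketch; it assumes TensorFlow is installed in the image and uses the standard ldconfig and python3 tools):

# inside a container started with --runtime nvidia: the host CUDA libraries should be listed
ldconfig -p | grep libcublas
# then confirm TensorFlow can see the GPU
python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"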

Related

RAPIDS.ai dependencies cuml and cudf not found no matter how I install

I have followed every version of the instructions on the AWS-EC2 setup for RAPIDS.ai: https://rapids.ai/cloud#AWS-EC2
I can confirm that I am using the exact instance type in the instructions, and following the steps exactly.
When I try to use the docker approach, the --gpus all command is not accepted.
When I try to use the conda approach, the install fails with the error:
PackageNotFoundError: Packages missing in current channels:
- glibc
I have tried (many) different solutions provided to solve both of these problems, none of them seem to work. I really just need to test some python code with cuml and cudf imports in a notebook. Been at this for 7 hours (after giving up on my local and SageMaker).
You note that the --gpus all command is not accepted, which suggests that you do not have the NVIDIA Docker runtime installed.
I followed the instructions you linked and ran into an issue where the sudo yum install -y nvidia-docker2 command failed; I needed to disable an Amazon yum repo that was causing some conflicts, as outlined in this issue.
$ sudo yum-config-manager --disable amzn2-graphics
$ sudo yum install -y nvidia-docker2
$ sudo yum-config-manager --enable amzn2-graphics
Once I'd done that and run sudo systemctl restart docker I was able to start the RAPIDS container.
$ docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/rapidsai:cuda11.2-runtime-ubuntu18.04-py3.7
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.download.nvidia.com/licenses/NVIDIA_Deep_Learning_Container_License.pdf
A JupyterLab server has been started!
To access it, visit http://localhost:8888 on your host machine.
Ensure the following arguments were added to "docker run" to expose the JupyterLab server to your host machine:
-p 8888:8888 -p 8787:8787 -p 8786:8786
Make local folders visible by bind mounting to /rapids/notebooks/host
(rapids) root@be7253bb4fdb:/rapids/notebooks#
Turns out, the first AMI suggested in the documentation is not compatible. Use the Deep Learning NVIDIA one instead.
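To confirm the container actually resolves the imports in question, a quick check from inside it (a sketch; the versions printed are whatever the image ships):

python -c "import cudf, cuml; print(cudf.__version__, cuml.__version__)"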

tensorflow:2.0.0 docker container unable to access the GPU

Hi to everybody and thanks in advance.
My goal is to use a GPU-enabled container to run the notebooks from the Hands-On Machine Learning book (2nd edition). The idea is to use the GPU-enabled container, maybe add some imports, and then commit it to create a new image.
I checked the prerequisites as from https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0).
O.S. Ubuntu 18.04
Processor: Intel® Core™ i7-7700HQ CPU @ 2.80GHz × 8
Graphics card: GeForce GTX 1080/PCIe/SSE2
NVIDIA-SMI 418.87.00, and the graphics card is recognized,
docker Version: 19.03.5 API version: 1.40,
nvidia-docker2 is already the newest version (2.2.2-1),
nvidia-docker (the old version) is not present,
Executing:
docker pull tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
docker run -u $(id -u):$(id -g) -it --rm -v $(realpath ~/Projects/GDL/GDL_code):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
the container starts regularly and I can use the notebooks, but with no GPU support...
import tensorflow as tf
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
only the CPU is recognized...
I'm probably missing something obvious... I'm a newbie with Docker and TensorFlow...
Any help is appreciated!
Sorry, I was missing --runtime=nvidia ...
docker run --runtime=nvidia -u $(id -u):$(id -g) -it --rm -v $(realpath ~/Projects/GDL/GDL_code):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
Thanks anyway to everybody!
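For what it's worth, since Docker 19.03 is already installed here, the native --gpus flag is an alternative to the legacy runtime (a sketch; it assumes the NVIDIA Container Toolkit is present, same image and mount as above):

docker run --gpus all -u $(id -u):$(id -g) -it --rm -v $(realpath ~/Projects/GDL/GDL_code):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter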

Setting up CUDA with Docker hits a 'permission not granted' problem

I would like to set up CUDA using the following command:
docker run -ti --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda
I kept getting these errors:
Command 'docker' not found, but can be installed with:
snap install docker # version 18.06.1-ce, or
apt install docker.io # version 18.09.7-0ubuntu1~19.04.5
See 'snap info docker' for additional versions.
I tried to Google these errors, but failed.
System environment: Ubuntu Desktop 19.04
I should explain that this is a clean system I'm currently using.
One thing to note: installing anything with Docker comes with a prerequisite, which is that you install Docker itself first.
You can find the tutorials on how you could install docker in the following link:
How to install Docker
Then you can pull and run the NVIDIA CUDA container with the following commands:
docker pull nvidia/cuda
docker run -ti --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda
which are referenced on the NVIDIA CUDA Docker Hub and NVIDIA CUDA GitHub pages.
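For completeness, a minimal install sketch on Ubuntu 19.04 (the docker.io package name comes from the error message above; the nvidia-docker2 step assumes the NVIDIA package repository has already been added):

# install Docker from the Ubuntu repository
sudo apt install docker.io
# install the NVIDIA runtime so --runtime=nvidia becomes available
sudo apt install nvidia-docker2
sudo systemctl restart docker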

Dockerfile for tensorflow GPU + opencv3 + Jupyter Notebook?

I am a bit confused. Right now I am not using a Dockerfile but the command:
docker run -it --rm -v $(realpath ~/path/of/directory):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-py3-jupyter
But I don't have access to the cv2 module. Maybe I should add something to the run command, or write my own Dockerfile? But I don't know how. Do I have to pip install and RUN tensorflow-gpu and Jupyter Notebook in the Dockerfile?
Check whether the image tensorflow/tensorflow:latest-py3-jupyter has OpenCV and the Python module cv2 installed.
If it doesn't, you should find another image or install it yourself.
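A quick way to check (a sketch; this prints the cv2 version or fails with an ImportError):

docker run --rm tensorflow/tensorflow:latest-py3-jupyter python -c "import cv2; print(cv2.__version__)"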
The TensorFlow image does not include OpenCV by default.
There are two ways you can set it up:
A - Set up the TensorFlow docker container and then install OpenCV with commands inside it, or
B - Create a Dockerfile that takes the TensorFlow image as its base image and installs OpenCV while building the image; see the sketch after the links below.
Please refer to the links below for a full example:
https://github.com/fbcotter/docker-tensorflow-opencv
https://github.com/fbcotter/docker-tensorflow-opencv/blob/master/Dockerfile
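A minimal sketch of option B (the base image tag and the choice of opencv-python-headless are assumptions; the headless build avoids the extra system GL libraries the full package needs):

# Dockerfile
FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
# headless OpenCV: importable as cv2, no GUI or system-GL dependencies
RUN pip install opencv-python-headless

Build and run it the same way as the stock image:

docker build -t tf-opencv-jupyter .
docker run --runtime=nvidia -it --rm -v $(realpath ~/path/of/directory):/tf/notebooks -p 8888:8888 tf-opencv-jupyter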

Install TensorFlow APIs using Docker environment

I want to train a model using TensorFlow, so to avoid dependency problems between Python, CUDA, TensorFlow, etc., I decided to use Docker.
To install Tensorflow I used these commands
# verify the nvidia docker runtime works
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
# pull the tensorflow docker image
docker pull tensorflow/tensorflow:1.12.0-gpu-py3
# run
docker run --runtime=nvidia -it --rm -v /$(pwd)/Desktop/DeepLab:/notebooks -p 8888:8888 tensorflow/tensorflow:1.12.0-gpu-py3
Now I want to install the DeepLab models, and to do this I have to clone the git repo into tensorflow/models/research/.
I did some research on the internet to learn how to add these folders, or where to find the location of the TensorFlow installation inside the docker container, but with no results.
So please, could anyone help me?
Thank you.
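One approach that follows the same bind-mount pattern already used in the run command above (a sketch; the host paths are assumptions): clone the repo on the host and mount it into the container instead of hunting for an install location inside the image.

# on the host: clone the models repo (DeepLab lives under research/deeplab)
git clone https://github.com/tensorflow/models.git ~/Desktop/DeepLab/models
# the existing -v mount then makes it visible in the container under /notebooks/models
docker run --runtime=nvidia -it --rm -v ~/Desktop/DeepLab:/notebooks -p 8888:8888 tensorflow/tensorflow:1.12.0-gpu-py3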
