docker over virtualbox cannot start

I have a Dockerfile that creates a valid image and runs fine on my Ubuntu 18.04 machine.
For compatibility with other machines, I've tried to run the container in a VirtualBox Ubuntu VM (to avoid any configuration errors that may occur elsewhere).
My docker run command line:
docker run -id --net=host --rm --privileged --gpus=all --env="NVIDIA_DRIVER_CAPABILITIES=all" --env="DISPLAY" -e DISPLAY=:0 -v /tmp/.X11-unix:/tmp/.X11-unix:rw -v /run/user/1000/gdm/Xauthority:/root/.Xauthority --env="QT_X11_NO_MITSHM=1" --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /home/git/:/git --name nirge_sim nirge-sim:1.0
The base Dockerfile:
FROM gazebo:gzserver9-bionic
# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES \
${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics
# install Utilities
RUN apt-get update -y && apt-get install -y apt-utils curl ca-certificates wget \
&& rm -rf /var/lib/apt/lists/*
# install gazebo packages
RUN apt-get update -y && apt-get install -y --allow-unauthenticated --no-install-recommends \
libgazebo9-dev \
&& rm -rf /var/lib/apt/lists/*
# install ros packages
RUN sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'
RUN curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | apt-key add -
RUN apt-get update && apt-get install -y --allow-unauthenticated \
ros-melodic-desktop-full \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get install -y --allow-unauthenticated --no-install-recommends \
ros-melodic-gazebo-ros-pkgs ros-melodic-gazebo-ros-control \
ros-melodic-gazebo-plugins ros-melodic-gazebo-ros \
ros-melodic-simulators \
&& rm -rf /var/lib/apt/lists/*
# final config for ros
RUN echo 'source /opt/ros/melodic/setup.bash' >> /root/.bashrc
RUN echo 'export LIBGL_ALWAYS_INDIRECT=1' >> /root/.bashrc
CMD ["bash"]
So this works on my host, but not on the guest running inside VirtualBox.
The error is:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
Would appreciate any advice on this issue.

It appears that one of the dependencies (Gazebo) requires a dedicated GPU, and VirtualBox does not pass the host's NVIDIA GPU through to the guest. Since no NVIDIA driver is loaded inside the VM, nvidia-container-cli fails with the "driver not loaded" error above.
NVIDIA cards tend to work well on Ubuntu when Docker runs directly on the host.
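As a quick sanity check inside the VirtualBox guest, you can verify whether an NVIDIA driver is loaded at all before handing --gpus to Docker. The sketch below is illustrative: the image name nirge-sim:1.0 comes from the question, and without a usable GPU you simply drop the GPU-specific flags so the NVIDIA runtime hook is never invoked (Gazebo may then fall back to software rendering, with reduced performance).
# inside the VirtualBox guest: both of these fail if no NVIDIA driver is loaded
nvidia-smi
lsmod | grep nvidia
# without a usable GPU, omit --gpus and the NVIDIA_* environment variables
docker run -id --net=host --rm -e DISPLAY=:0 \
  -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
  -v /home/git/:/git \
  --name nirge_sim nirge-sim:1.0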

Related

"docker:19.03-dind" could not select device driver "nvidia" with capabilities: [[gpu]]

I have a K8S + DinD (Docker-in-Docker) issue:
launch a Kubernetes cluster
start a main Docker container and a DinD container inside this cluster
when running a job that requests a GPU, I get the error could not select device driver "nvidia" with capabilities: [[gpu]]
Full error
http://localhost:2375/v1.40/containers/long-hash-string/start: Internal Server Error ("could not select device driver "nvidia" with capabilities: [[gpu]]")
When I exec into the DinD container inside the K8S pod, nvidia-smi is not available.
After some debugging, it seems the DinD image is missing the NVIDIA container toolkit. I hit the same error when running the job directly with Docker on my laptop, and fixed it there by installing nvidia-docker2 (sudo apt-get install -y nvidia-docker2).
I'm thinking I could install nvidia-docker2 into the DinD 19.03 image (docker:19.03-dind), but I'm not sure how to do it. With a multi-stage Docker build?
Thank you very much!
update:
pod spec:
spec:
  containers:
  - name: dind-daemon
    image: docker:19.03-dind
I got it working myself.
Referring to
https://github.com/NVIDIA/nvidia-docker/issues/375
https://github.com/Henderake/dind-nvidia-docker
First, I modified the ubuntu-dind image (https://github.com/billyteves/ubuntu-dind) to install nvidia-docker (i.e. I added the instructions from the nvidia-docker site to its Dockerfile) and changed it to be based on nvidia/cuda:9.2-runtime-ubuntu16.04.
Then I created a pod with two containers: a frontend Ubuntu container and a privileged Docker daemon container as a sidecar. The sidecar's image is the modified one mentioned above; a sketch of such a pod spec is shown after the Dockerfile below.
But since those posts are three years old now, I did spend quite some time matching up dependency versions, repository migrations over the years, and so on.
My modified version of the Dockerfile to build it:
ARG CUDA_IMAGE=nvidia/cuda:11.0.3-runtime-ubuntu20.04
FROM ${CUDA_IMAGE}
ARG DOCKER_CE_VERSION=5:18.09.1~3-0~ubuntu-xenial
RUN apt-get update -q && \
apt-get install -yq \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common && \
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable" && \
apt-get update -q && apt-get install -yq docker-ce docker-ce-cli containerd.io
# https://github.com/docker/docker/blob/master/project/PACKAGERS.md#runtime-dependencies
RUN set -eux; \
apt-get update -q && \
apt-get install -yq \
btrfs-progs \
e2fsprogs \
iptables \
xfsprogs \
xz-utils \
# pigz: https://github.com/moby/moby/pull/35697 (faster gzip implementation)
pigz \
# zfs \
wget
# set up subuid/subgid so that "--userns-remap=default" works out-of-the-box
RUN set -x \
&& addgroup --system dockremap \
&& adduser --system --ingroup dockremap dockremap \
&& echo 'dockremap:165536:65536' >> /etc/subuid \
&& echo 'dockremap:165536:65536' >> /etc/subgid
# https://github.com/docker/docker/tree/master/hack/dind
ENV DIND_COMMIT 37498f009d8bf25fbb6199e8ccd34bed84f2874b
RUN set -eux; \
wget -O /usr/local/bin/dind "https://raw.githubusercontent.com/docker/docker/${DIND_COMMIT}/hack/dind"; \
chmod +x /usr/local/bin/dind
##### Install nvidia docker #####
# Add the package repositories
RUN curl -fsSL https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add --no-tty -
RUN distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && \
echo $distribution && \
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
tee /etc/apt/sources.list.d/nvidia-docker.list
RUN apt-get update -qq --fix-missing
RUN apt-get install -yq nvidia-docker2
RUN sed -i '2i \ \ \ \ "default-runtime": "nvidia",' /etc/docker/daemon.json
RUN mkdir -p /usr/local/bin/
COPY dockerd-entrypoint.sh /usr/local/bin/
RUN chmod 777 /usr/local/bin/dockerd-entrypoint.sh
RUN ln -s /usr/local/bin/dockerd-entrypoint.sh /
VOLUME /var/lib/docker
EXPOSE 2375
ENTRYPOINT ["dockerd-entrypoint.sh"]
#ENTRYPOINT ["/bin/sh", "/shared/dockerd-entrypoint.sh"]
CMD []
When I exec into the Docker-in-Docker container, I can successfully run nvidia-smi (which previously returned a "not found" error, so no GPU-related docker run could work).
Feel free to pull my image: brandsight/dind:nvidia-docker
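For reference, a minimal sketch of the two-container pod described above could look like the following. The container names, the frontend image, the GPU resource request, and the emptyDir volume are illustrative assumptions; the sidecar image is the one built from the Dockerfile above, and the GPU request assumes the NVIDIA device plugin is installed on the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: dind-gpu-pod            # illustrative name
spec:
  containers:
  - name: frontend              # container you exec into; talks to the DinD daemon over localhost
    image: ubuntu:20.04         # assumed frontend image
    command: ["sleep", "infinity"]
    env:
    - name: DOCKER_HOST
      value: tcp://localhost:2375
  - name: dind-daemon           # privileged DinD sidecar built from the Dockerfile above
    image: brandsight/dind:nvidia-docker
    securityContext:
      privileged: true
    resources:
      limits:
        nvidia.com/gpu: 1       # requires the NVIDIA device plugin
    volumeMounts:
    - name: docker-storage
      mountPath: /var/lib/docker
  volumes:
  - name: docker-storage
    emptyDir: {}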

Problems installing csvtk with Docker using debian:stretch-slim

I am a newbie to Docker and I am trying to install csvtk via Docker using debian:stretch-slim.
Below is my Dockerfile:
FROM debian:stretch-slim
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
jq \
perl \
python3 \
wget \
&& rm -rf /var/lib/apt/lists/*
RUN wget -qO- https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz | tar -xz \
&& cp csvtk /usr/local/bin/
It fails at the csvtk step with the below error message:
Step 3/3 : RUN wget -qO- https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz | tar -xz && cp csvtk /usr/local/bin/
---> Running in 0f3a0e75a5de
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
The command '/bin/sh -c wget -qO- https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz | tar -xz && cp csvtk /usr/local/bin/' returned a non-zero code: 2
I would appreciate any help/suggestions.
Thanks in advance.
wget was exiting with error code 5, which means SSL verification failed. You just need to install ca-certificates alongside wget.
This Dockerfile should build successfully:
FROM debian:stretch-slim
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
jq \
perl \
python3 \
wget \
# added this package to help with ssl certs in Docker
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
RUN wget -qO- https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz | tar -xz \
&& cp csvtk /usr/local/bin/
As a general tip when debugging issues like these, it's easiest to remove the offending RUN line from your Dockerfile, build the image, start a shell in the container, and run the commands manually so you can see the full error output. Like this:
docker build -t test:v1 .
docker run --rm -it test:v1 /bin/bash
# run commands manually and check the full error output
While combining multiple commands into a single RUN instruction with && is best practice for reducing the number of image layers, it can make failures harder to debug during the build.
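For example, reproducing the failing step by hand in a throwaway debian:stretch-slim container makes the certificate problem visible; this is just an illustrative sketch, relying on the fact (noted above) that wget exits with code 5 on SSL verification failure:
docker run --rm -it debian:stretch-slim /bin/bash
# inside the container:
apt-get update && apt-get install -y --no-install-recommends wget   # note: no ca-certificates yet
wget https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz
echo $?   # 5 = SSL verification failure
apt-get install -y --no-install-recommends ca-certificates
wget https://github.com/shenwei356/csvtk/releases/download/v0.23.0/csvtk_linux_amd64.tar.gz   # now succeeds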

How to run GNOME on Docker with Wayland

I want to run the GNOME desktop using Wayland in Docker.
I have tried it with the Dockerfile below, but I get the errors shown.
How do I run the GNOME desktop in Docker?
Environments:
Host OS - Arch Linux with Wayland
Container - Ubuntu 20.10
Error:
dbus[8]: Unable to set up transient service directory: XDG_RUNTIME_DIR "/tmp" is owned by uid 0, not our uid 1000
** (process:6): WARNING **: 14:32:13.386: Could not make bus activated clients aware of XDG_CURRENT_DESKTOP=GNOME environment variable: Could not connect: Connection refused
libEGL warning: wayland-egl: could not open /dev/dri/card0 (No such file or directory)
gnome-session-binary[6]: WARNING: Could not make bus activated clients aware of QT_IM_MODULE=ibus environment variable: Could not connect: Connection refused
gnome-session-binary[6]: WARNING: Could not make bus activated clients aware of XMODIFIERS=#im=ibus environment variable: Could not connect: Connection refused
gnome-session-binary[6]: WARNING: Could not make bus activated clients aware of GNOME_DESKTOP_SESSION_ID=this-is-deprecated environment variable: Could not connect: Connection refused
gnome-session-binary[6]: WARNING: Could not make bus activated clients aware of XDG_MENU_PREFIX=gnome- environment variable: Could not connect: Connection refused
gnome-session-binary[6]: ERROR: Failed to connect to system bus: Could not connect: No such file or directory
aborting...
Trace/breakpoint trap (core dumped)
Dockerfile:
from ubuntu:rolling
env DEBIAN_FRONTEND noninteractive
run dpkg --add-architecture i386
run apt-get update -y && apt-get install -y apt-utils && \
apt-get upgrade -y && \
apt-get install -y gnome-session && \
# xorg is installed but is not actually used
apt-get install -y xorg && \
apt-get install -y net-tools wget && \
apt-get install -y sudo && \
apt-get update -y && \
apt-get install -y wine64 && \
apt-get install -y wine32 && \
apt-get install -y cabextract
run useradd -s /bin/bash -m kiyugad && gpasswd -a kiyugad sudo
run wget https://desktop.line-scdn.net/win/new/LineInst.exe
run wget https://raw.githubusercontent.com/Winetricks/winetricks/master/src/winetricks && \
mv winetricks /usr/local/bin/winetricks && \
cd /usr/local/bin && \
chmod +x winetricks
run winetricks corefonts fakejapanese_ipamona fakejapanese_vlgothic
cmd gnome-session
Docker run command:
docker run -e XDG_RUNTIME_DIR=/tmp \
-e WAYLAND_DISPLAY=$WAYLAND_DISPLAY \
-e QT_QPA_PLATFORM=wayland \
-e GDK_BACKEND=wayland \
-e CLUTTER_BACKEND=wayland \
-e DISPLAY=:0 \
--net=host \
-v $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY:/tmp/$WAYLAND_DISPLAY \
--user=$(id -u):$(id -g) \
imagename

How to install aws-cli in docker container based on maven:3.6.3-openjdk-14 image?

I would like to install the AWS CLI in the image below, but I received the error shown. I tried apk and apt, but neither worked. Can you please help me figure out how to update my Dockerfile?
I do not want to change my base image; I need to use maven:3.6.3-openjdk-14.
sh: apt-get: command not found
FROM maven:3.6.3-openjdk-14
RUN apt-get update \
&& apt-get install -y vim jq unzip curl \
&& apt-get upgrade -y
#install aws 2
RUN curl --silent --show-error --fail "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
unzip awscliv2.zip && \
./aws/install && \
rm -rf awscliv2.zip
The Docker image maven:3.6.3-openjdk-14 is based on Oracle Linux, which uses rpm-based tooling (yum/microdnf) to manage packages, so apt-get is not available. You can check the base OS yourself:
docker run -it --rm maven:3.6.3-openjdk-14 cat /etc/os-release
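A minimal sketch of a Dockerfile that installs the AWS CLI v2 using the rpm-based tooling instead, assuming yum is available in this Oracle Linux base (some tags ship microdnf instead, in which case substitute it for yum; vim and jq from the original Dockerfile may need an extra repository such as EPEL):
FROM maven:3.6.3-openjdk-14
# use the rpm-based package manager instead of apt-get; curl and unzip are what the installer needs
RUN yum install -y unzip curl \
  && curl --silent --show-error --fail "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip \
  && unzip -q awscliv2.zip \
  && ./aws/install \
  && rm -rf awscliv2.zip aws \
  && yum clean all
# afterwards, aws --version should work inside the container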

Running Elasticsearch with Docker

I installed Elasticsearch in my image based on ubuntu:16.04 and started the service using
RUN service elasticsearch start
but it was not started.
If I go into the container and run the same command manually, it starts.
I want to run the service and dump the index while I create the image; below is a part of my Dockerfile.
How do I start Elasticsearch in the Dockerfile?
#install OpenJDK-8
RUN apt-get update && apt-get install -y openjdk-8-jdk && apt-get install -y ant && apt-get clean
RUN apt-get update && apt-get install -y ca-certificates-java && apt-get clean
RUN update-ca-certificates -f
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64/
RUN export JAVA_HOME
#download ES
RUN wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
RUN apt-get install -y apt-transport-https
RUN echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-6.x.list
RUN apt-get update && apt-get install -y elasticsearch
RUN service elasticsearch start
A RUN instruction executes only during the build phase; any process it starts is gone once that build step finishes. You should use CMD (or ENTRYPOINT) instead:
CMD service elasticsearch start && /bin/bash
It's better to wrap the startup commands in your own script and then execute only that file:
CMD /start_elastic.sh
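A minimal sketch of such a script is shown below. The filename start_elastic.sh matches the CMD above; the readiness loop and the trailing interactive shell are illustrative and assume the default HTTP port 9200 and that wget is available in the image (as in the Dockerfile from the question). Remember to COPY the script into the image and make it executable.
#!/bin/bash
# start_elastic.sh: start the Elasticsearch service, wait until it responds, then keep the container alive
set -e
service elasticsearch start
# wait for the HTTP endpoint before doing anything that needs the index
until wget -q -O /dev/null http://localhost:9200; do
  sleep 1
done
# load or dump your index here, then hand control to a shell (or exec your real main process)
exec /bin/bash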
I don't know why you don't use the official OSS image, but this Dockerfile based on Debian works:
FROM java:8-jre
ENV ES_NAME=elasticsearch \
ELASTICSEARCH_VERSION=6.6.1
ENV ELASTICSEARCH_URL=https://artifacts.elastic.co/downloads/$ES_NAME/$ES_NAME-$ELASTICSEARCH_VERSION.tar.gz
RUN apt-get update && apt-get install -y --assume-yes openssl bash curl wget \
&& mkdir -p /opt \
&& echo '[i] Start create elasticsearch' \
&& wget -T 15 -O /tmp/$ES_NAME-$ELASTICSEARCH_VERSION.tar.gz $ELASTICSEARCH_URL \
&& tar -xzf /tmp/$ES_NAME-$ELASTICSEARCH_VERSION.tar.gz -C /opt/ \
&& ln -s /opt/$ES_NAME-$ELASTICSEARCH_VERSION /opt/$ES_NAME \
&& useradd elastic \
&& mkdir -p /var/lib/elasticsearch /opt/$ES_NAME/plugins /opt/$ES_NAME/config/scripts \
&& chown -R elastic /opt/$ES_NAME-$ELASTICSEARCH_VERSION/
ENV PATH=/opt/elasticsearch/bin:$PATH
USER elastic
CMD [ "/bin/sh", "-c", "/opt/elasticsearch/bin/elasticsearch --E cluster.name=test --E network.host=0 $ELASTIC_CMD_OPTIONS" ]
I believe you'll be able to use most of these commands on Ubuntu as well.
Don't forget to run sudo sysctl -w vm.max_map_count=262144 on your host.
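For instance, building and running it might look like this (the image tag my-elasticsearch is an illustrative assumption; 9200 is the default Elasticsearch HTTP port):
# on the host, once per boot
sudo sysctl -w vm.max_map_count=262144
docker build -t my-elasticsearch .
docker run --rm -p 9200:9200 my-elasticsearch
# in another terminal, check that the node answers
curl http://localhost:9200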
