Create a Docker container for Machine Learning - docker

I'm not a Docker expert and I'm trying to create a container for machine learning projects. This is mostly for academic purpose, as I'm studying machine learning.
I wrote a dockerfile (and a devcontainer.json file to open the container in vscode) that runs fine until I add the lines to build tensorflow. I found three problems but I don't know what I'm missing:
I got a warning about the fact that Bazel installation isn't a release version
When the building phase arrives at ./configure I can't configure through prompt question
I got an error at the very building phase with
ERROR: The project you're trying to build requires Bazel 5.3.0 (specified in /docklearning/tensorflow/.bazelversion), but it wasn't found in /usr/bin.
This is the Dockerfile:
FROM nvidia/cuda:12.0.0-runtime-ubuntu22.04 as base
ARG USER_UID=1000
#switch to non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /docklearning
ADD . /docklearning
# Install packages
RUN apt-get update -q && apt-get install -q -y --no-install-recommends \
apt-transport-https curl gnupg apt-utils wget gcc g++ npm unzip build-essential ca-certificates curl git gh \
make nano iproute2 nano openssh-client openssl procps \
software-properties-common bzip2 subversion neofetch \
fontconfig && \
curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor >bazel-archive-keyring.gpg && \
mv bazel-archive-keyring.gpg /usr/share/keyrings && \
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/bazel-archive-keyring.gpg] https://storage.googleapis.com/bazel-apt stable jdk1.8" \
| tee /etc/apt/sources.list.d/bazel.list && \
apt update && apt install -q -y bazel && \
apt-get full-upgrade -q -y && \
cd ~ && \
wget https://github.com/ryanoasis/nerd-fonts/releases/download/v2.1.0/Meslo.zip && \
mkdir -p .local/share/fonts && \
unzip Meslo.zip -d .local/share/fonts && \
cd .local/share/fonts && rm *Windows* && \
cd ~ && \
rm Meslo.zip && \
fc-cache -fv && \
apt-get install -y zsh zsh-doc chroma
# Install anaconda
RUN wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh -O Anaconda.sh && \
/bin/bash Anaconda.sh -b -p /opt/conda && \
rm Anaconda.sh
# Install LSD for ls substitute and clean up
RUN wget https://github.com/Peltoche/lsd/releases/download/0.23.1/lsd_0.23.1_amd64.deb -P /tmp && \
dpkg -i /tmp/lsd_0.23.1_amd64.deb && \
rm /tmp/lsd_0.23.1_amd64.deb && \
apt-get autoremove -y && \
apt-get autoclean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*/apt/lists/* && \
useradd -r -m -s /bin/bash -u ${USER_UID} docklearning
# Add conda to PATH
ENV PATH=/opt/conda/bin:$PATH
ENV HOME=/home/docklearning
# Make zsh default shell
RUN chsh -s /usr/bin/zsh docklearning
# link conda to /etc/profile.d
RUN ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
# Update Conda Base env
RUN conda update -n base --all -y && \
conda install -c conda-forge bandit opt_einsum keras-preprocessing && \
conda clean -a -q -y
# Download Tensorflow from github and build from source
RUN git clone https://github.com/tensorflow/tensorflow.git && \
cd tensorflow && \
./configure && \
bazel build --config=opt --config=cuda --cxxopt="-mavx2" //tensorflow/tools/pip_package:build_pip_package && \
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg && \
pip install /tmp/tensorflow_pkg/tensorflow-*.whl && \
cd .. && \
rm -rf tensorflow
# Create shell history file
RUN mkdir ${HOME}/zsh_history && \
chown docklearning ${HOME}/zsh_history && \
mkdir ${HOME}/.ssh
# Switch to internal user
USER docklearning
WORKDIR ${HOME}
# Copy user configuration files
COPY --chown=docklearning ./config/.aliases.sh ./
COPY --chown=docklearning ./config/.bashrc ./
COPY --chown=docklearning ./config/.nanorc ./
# Configure Zsh for internal user
ENV ZSH=${HOME}/.oh-my-zsh
ENV ZSH_CUSTOM=${ZSH}/custom
ENV ZSH_PLUGINS=${ZSH_CUSTOM}/plugins
ENV ZSH_THEMES=${ZSH_CUSTOM}/themes
RUN wget -qO- https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh | zsh || true
RUN git clone --single-branch --branch 'master' --depth 1 https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_PLUGINS}/zsh-syntax-highlighting \
&& git clone --single-branch --branch 'master' --depth 1 https://github.com/zsh-users/zsh-autosuggestions ${ZSH_PLUGINS}/zsh-autosuggestions \
&& git clone --single-branch --depth 1 https://github.com/romkatv/powerlevel10k.git ${ZSH_THEMES}/powerlevel10k
COPY --chown=docklearning ./config/.p10k.zsh ./
COPY --chown=docklearning ./config/.zshrc ./
CMD [ "/bin/zsh" ]

I wouldn't say this question about docker or TensorFlow. It is about package installation and OS configuration.
In this case, you need to fix the bazel version first by specifying it explicitly:
apt install -q -y bazel-5.3.0
But that package is quite wired and doesn't create a symlink, so you need to create it:
ln -s /bin/bazel-5.3.0 /bin/bazel
You can validate if bazel installed by running it:
`bazel --version
bazel 5.3.0`

Related

Kartoza Geoserver on Heroku

I am new to docker and not an IT-specialist. I try to install Kartoza Geoserver in Docker on Heroku, but so far no success. Does anyone has experience with this and can explain me the settings in the dockerfile and the steps specifically for a Heroku install?
So far I tried a build with a modified dockerfile but I always get the same error (in the log trail) when opening/launching geoserver on Heroku:
"Error: groupadd: cannot open /etc/group".
I guess it is an permission/privileges issue.
Any sharing of experience on modifying the docker file so that the image is read by Heroku would be helpfull.
Modifying the settings in the dockerfile:
Removed the port forwarding from dockerfile
Add RUN adduser -D myuser USER myuser to dockerfile
Result dockerfile:
#--------- Generic stuff all our Dockerfiles should start with so we get caching ------------
ARG IMAGE_VERSION=9.0.65-jdk11-openjdk-slim-buster
ARG JAVA_HOME=/usr/local/openjdk-11
FROM tomcat:$IMAGE_VERSION
LABEL maintainer="Tim Sutton<tim#linfiniti.com>"
ARG GS_VERSION=2.22.0
ARG WAR_URL=https://downloads.sourceforge.net/project/geoserver/GeoServer/${GS_VERSION}/geoserver-${GS_VERSION}-war.zip
ARG STABLE_PLUGIN_BASE_URL=https://sourceforge.net/projects/geoserver/files/GeoServer
ARG DOWNLOAD_ALL_STABLE_EXTENSIONS=1
ARG DOWNLOAD_ALL_COMMUNITY_EXTENSIONS=1
ARG HTTPS_PORT=8443
ENV DEBIAN_FRONTEND=noninteractive
#Install extra fonts to use with sld font markers
RUN adduser -D myuser
USER myuser
RUN set -eux; \
apt-get update; \
apt-get -y --no-install-recommends install \
locales gnupg2 wget ca-certificates rpl pwgen software-properties-common iputils-ping \
apt-transport-https curl gettext fonts-cantarell lmodern ttf-aenigma \
ttf-bitstream-vera ttf-sjfonts tv-fonts libapr1-dev libssl-dev \
wget zip unzip curl xsltproc certbot cabextract gettext postgresql-client figlet gosu gdal-bin; \
# Install gdal3 - bullseye doesn't build libgdal-java anymore so we can't upgrade
curl https://deb.meteo.guru/velivole-keyring.asc | apt-key add - \
&& echo "deb https://deb.meteo.guru/debian buster main" > /etc/apt/sources.list.d/meteo.guru.list \
&& apt-get update \
&& apt-get -y --no-install-recommends install gdal-bin libgdal-java; \
dpkg-divert --local --rename --add /sbin/initctl \
&& (echo "Yes, do as I say!" | apt-get remove --force-yes login) \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*; \
# verify that the binary works
gosu nobody true
ENV \
JAVA_HOME=${JAVA_HOME} \
DEBIAN_FRONTEND=noninteractive \
GEOSERVER_DATA_DIR=/opt/geoserver/data_dir \
GDAL_DATA=/usr/share/gdal \
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/tomcat/native-jni-lib:/usr/lib/jni:/usr/local/apr/lib:/opt/libjpeg-turbo/lib64:/usr/lib:/usr/lib/x86_64-linux-gnu" \
FOOTPRINTS_DATA_DIR=/opt/footprints_dir \
GEOWEBCACHE_CACHE_DIR=/opt/geoserver/data_dir/gwc \
CERT_DIR=/etc/certs \
RANDFILE=/etc/certs/.rnd \
FONTS_DIR=/opt/fonts \
GEOSERVER_HOME=/geoserver \
EXTRA_CONFIG_DIR=/settings \
COMMUNITY_PLUGINS_DIR=/community_plugins \
STABLE_PLUGINS_DIR=/stable_plugins
WORKDIR /scripts
ADD resources /tmp/resources
ADD build_data /build_data
ADD scripts /scripts
RUN echo $GS_VERSION > /scripts/geoserver_version.txt ;\
chmod +x /scripts/*.sh;/scripts/setup.sh \
&& apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN echo 'figlet -t "Kartoza Docker GeoServer"' >> ~/.bashrc
WORKDIR ${GEOSERVER_HOME}
ENTRYPOINT ["/bin/bash", "/scripts/entrypoint.sh"]
Then build it with argument --platform linux/amd64 (I use arm64 architecture). Then pushed it to Heroku. All the time I get the same error.

docker-compose.yml - container with exited status on Ubuntu host

My docker-compose.yml:
version: "3.3"
services:
build_and_run_service:
image: myapp:0
build: .
network_mode: host
volumes:
- './bin/cookie:/app/cookie'
- './bin/logs:/app/logs'
- './bin/warehouse:/app/warehouse'
Dockerfile doesn't contain CMD and ENTRYPOINT, so when I execute commands in that order:
docker build --tag myapp:0 .
docker run -d -t myapp:0
docker exec -it <container_id> /bin/bash
It works as expected.
For some reason the container is not working when using docker compose...
Commands order:
docker-compose up -d --build
docker-compose run -d build_and_run_service bash
What's wrong?
Both cases work fine on Windows but not on Ubuntu...
#edit
Dockerfile:
FROM ubuntu:20.04 as runtime
LABEL description="Build and run container - myapp"
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN apt-get install -y nano
RUN apt-get install -y wget
RUN apt-get install -y curl
RUN apt-get install -y make
RUN apt-get install -y build-essential
RUN apt-get install -y tcl zlib1g-dev libssl-dev tk libcurl4-gnutls-dev libexpat1-dev gettext dos2unix
# Compilers
RUN apt-get install -y gcc-10
RUN apt-get install -y g++-10
RUN rm /usr/bin/gcc \
&& ln -s /usr/bin/gcc-10 /usr/bin/gcc
RUN rm /usr/bin/g++ \
&& ln -s /usr/bin/g++-10 /usr/bin/g++
# Postgres dev
RUN sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
RUN wget --no-check-certificate --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
RUN apt-get update
RUN apt-get install -y libpq-dev postgresql-server-dev-13
RUN cd /tmp \
&& wget --no-check-certificate https://www.openssl.org/source/openssl-1.1.1g.tar.gz \
&& tar -zxf openssl-1.1.1g.tar.gz \
&& cd openssl-1.1.1g \
&& ./config \
&& make \
&& make install \
&& rm /usr/bin/openssl \
&& ln -s /usr/local/bin/openssl /usr/bin/openssl \
&& ldconfig
RUN cd /tmp \
&& wget --no-check-certificate https://cmake.org/files/v3.19/cmake-3.19.6-Linux-x86_64.tar.gz \
&& tar -zxf cmake-3.19.6-Linux-x86_64.tar.gz \
&& mv cmake-3.19.6-Linux-x86_64 /usr/local/ \
&& ln -s /usr/local/cmake-3.19.6-Linux-x86_64/bin/cmake /usr/bin/cmake
RUN cd /tmp \
&& wget --no-check-certificate https://mirrors.edge.kernel.org/pub/software/scm/git/git-2.31.0.tar.gz \
&& tar -zxf git-2.31.0.tar.gz \
&& cd git-2.31.0 \
&& make prefix=/usr/local all \
&& make prefix=/usr/local install
RUN cd /tmp \
&& wget --no-check-certificate https://boostorg.jfrog.io/artifactory/main/release/1.75.0/source/boost_1_75_0.tar.gz \
&& tar -zxf boost_1_75_0.tar.gz \
&& cd boost_1_75_0 \
&& ./bootstrap.sh \
&& ./b2 \
&& ./b2 install
VOLUME ["/app/cookie", "/app/logs", "/app/warehouse"]
WORKDIR /app
COPY . /src
RUN cd /src \
&& mkdir build \
&& cd build
# Some building command
## PRIVATE ##
# Removes tmp
RUN cd /tmp \
&& rm -r *

NVIDIA Driver Not found during Nvidia + Cuda - Docker Image build

I am trying to create a GPU microservice using Nvidia cuda Base image, but during the docker build, I am facing Driver not found issue, can someone point out what is missing here?
DockerFile:
FROM nvidia/cuda:10.1-devel
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
libx11-6 \
&& rm -rf /var/lib/apt/lists/*
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PATH=/home/user/miniconda/bin:$PATH
RUN curl -sLo ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p ~/miniconda \
&& rm ~/miniconda.sh \
&& conda install -y python==3.7 \
&& conda clean -ya
ENV PATH="/usr/local/cuda-10.1/bin:$PATH"
ENV LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
ENV NVIDIA_VISIBLE_DEVICES=all
ENV FORCE_CUDA="1"
RUN conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
RUN pip install -v -e .
Error:
"/home/user/miniconda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1013, in _get_cuda_arch_flags
capability = torch.cuda.get_device_capability()
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 320, in get_device_capability
prop = get_device_properties(device)
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 325, in get_device_properties
_lazy_init() # will define _get_device_properties and _CudaDeviceProperties
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
_check_driver()
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
The issues happens during execution of last step in docker file.
I tried using multiple Nvidia base docker images, but didn't help much. (cuda:10.1-base-ubuntu18.04, cuda:10.1-runtime-ubuntu18.04)
Any pointers appreciated.
After lot of trial and errors and going through a lot of documentation, this is what worked fine.
ARG PYTORCH=1.3
ARG CUDA=10.1
ARG CUDNN=7
FROM pytorch/pytorch:1.3-cuda10.1-cudnn7-devel
RUN mkdir /app
WORKDIR /app
ENV TORCH_CUDA_ARCH_LIST="5.2 6.0 6.1 7.0+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"
RUN apt-get update && apt-get install -y libglib2.0-0 libsm6 libxrender-dev libxext6 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
libx11-6 \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential g++ \
libglib2.0-0 libsm6 libxrender-dev libxext6 wget
# Create a non-root user and switch to it
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
&& chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user
# All users can use /home/user as their home directory
ENV HOME=/home/user
RUN chmod 777 /home/user
# Install Miniconda and Python 3.7
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PATH=/home/user/miniconda/bin:$PATH
RUN curl -sLo ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p ~/miniconda \
&& rm ~/miniconda.sh \
&& conda install -y python==3.7 \
&& conda clean -ya
RUN conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
RUN pip install -v -e .
Hope this helps!
Good luck!

Unable to push clojure project on heroku with dockerfile

I have the following Dockerfile:
# We will use Ubuntu for our image
FROM ubuntu:latest
# Updating Ubuntu packages
ARG CLOJURE_TOOLS_VERSION=1.10.1.507
RUN apt-get -qq update && apt-get -qq -y install curl wget bzip2 openjdk-8-jdk-headless \
&& curl -sSL https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /tmp/miniconda.sh \
# && curl -sSL https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o /tmp/miniconda.sh \
&& bash /tmp/miniconda.sh -bfp /usr/local \
&& rm -rf /tmp/miniconda.sh \
&& conda install -y python=3 \
&& conda update conda \
&& curl -o install-clojure https://download.clojure.org/install/linux-install-${CLOJURE_TOOLS_VERSION}.sh \
&& chmod +x install-clojure \
&& ./install-clojure && rm install-clojure \
# no need to install lein
&& wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein \
&& chmod a+x lein \
&& mv lein /usr/bin \
&& apt-get -qq -y autoremove \
&& apt-get autoclean \
&& rm -rf /var/lib/apt/lists/* /var/log/dpkg.log \
&& conda clean --all --yes
ENV PATH /opt/conda/bin:$PATH
RUN conda create -n pyclj python=3.7 && conda install -n pyclj numpy mxnet \
&& conda install -c conda-forge opencv
## To install pip packages into the pyclj environment do
RUN conda run -n pyclj python3 -mpip install numpy opencv-python
FROM openjdk:8-alpine
RUN lein uberjar
COPY target/uberjar/vendo.jar /vendo/app.jar
EXPOSE 3000
CMD ["java", "-jar", "/vendo/app.jar", "--server.port=$PORT"]
And I'm pushing my project using git push heroku master, and I get the error:
remote: Step 8/11 : RUN lein uberjar
remote: ---> Running in 07533c6b0e9c
remote: /bin/sh: lein: not found
remote: The command '/bin/sh -c lein uberjar' returned a non-zero code: 127
Suggesting that lein wasn't installed. The wget in that first RUN command is supposed to install lein. How do I fix this issue?
You need to add a RUN lein to run the self-install.

Dockerfile not able to find ionic or cordova after install

I have a Dockerfile for which I'm trying to install ionic and cordova. The installation seems to work without error but I'm not able use them inside the container. The thing is, they are installed globally so I not sure what I'm missing. Any assistance is appreciated.
FROM openjdk:8
MAINTAINER me
# Install Git and dependencies
RUN dpkg --add-architecture i386 \
&& apt-get update \
&& apt-get install -y file git curl zip libncurses5:i386 libstdc++6:i386 zlib1g:i386 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists /var/cache/apt
RUN curl -sL https://deb.nodesource.com/setup_8.x | bash -
RUN apt-get install -y nodejs
RUN mkdir -p /tmp/yarn && \
mkdir -p /opt/yarn/dist && \
cd /tmp/yarn && \
wget -q https://yarnpkg.com/latest.tar.gz && \
tar zvxf latest.tar.gz && \
find /tmp/yarn -maxdepth 2 -mindepth 2 -exec mv {} /opt/yarn/dist/ \; && \
rm -rf /tmp/yarn
RUN ln -sf /opt/yarn/dist/bin/yarn /usr/local/bin/yarn && \
ln -sf /opt/yarn/dist/bin/yarn /usr/local/bin/yarnpkg && \
yarn --version
RUN yarn global add ionic
RUN yarn global add cordova
# Set up environment variables
ENV ANDROID_HOME="/home/user/android-sdk-linux" \
SDK_URL="https://dl.google.com/android/repository/sdk-tools-linux-3859397.zip" \
GRADLE_URL="https://services.gradle.org/distributions/gradle-4.5.1-all.zip"
# Create a non-root user
RUN useradd -m user
USER user
WORKDIR /home/user
# Download Android SDK
RUN mkdir "$ANDROID_HOME" .android \
&& cd "$ANDROID_HOME" \
&& curl -o sdk.zip $SDK_URL \
&& unzip sdk.zip \
&& rm sdk.zip \
&& yes | $ANDROID_HOME/tools/bin/sdkmanager --licenses
# Install Gradle
RUN wget $GRADLE_URL -O gradle.zip \
&& unzip gradle.zip \
&& mv gradle-4.5.1 gradle \
&& rm gradle.zip \
&& mkdir .gradle
ENV PATH="/home/user/gradle/bin:${ANDROID_HOME}/tools:${ANDROID_HOME}/platform-tools:${PATH}"

Resources