Apache Beam not recognizing custom container arguments - docker

I'm trying to use a local Docker image to run a Beam pipeline, but it looks like the image is not being recognized, even after I follow the steps suggested in the Beam documentation (https://beam.apache.org/documentation/runtime/environments/).
I performed the following steps:
Created a Dockerfile with my custom dependencies (pypostal and fuzzywuzzy):
Dockerfile
FROM apache/beam_python3.7_sdk:2.25.0
## System Dependencies
RUN apt-get update && apt-get upgrade -y && apt-get clean
ENV TZ=America
RUN DEBIAN_FRONTEND="noninteractive" apt-get -y install tzdata
# Python package management and basic dependencies
RUN apt-get install -y curl python3.7 python3.7-dev python3.7-distutils build-essential graphviz git-all
# Register the version in alternatives
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.7 1
## Create User Directory
RUN mkdir -p /home/user
# LIBPOSTAL
# Install Libpostal dependencies
RUN apt-get update &&\
apt-get install -y \
git \
make \
curl \
autoconf \
automake \
libtool \
pkg-config
# Download libpostal source to /usr/local/libpostal
RUN cd /usr/local && \
git clone https://github.com/openvenues/libpostal
# Create Libpostal data directory at /var/libpostal/data
RUN cd /var && \
mkdir libpostal && \
cd libpostal && \
mkdir data
# Install Libpostal from source
RUN cd /usr/local/libpostal && \
./bootstrap.sh && \
./configure --datadir=/var/libpostal/data && \
make -j4 && \
make install && \
ldconfig
# Python Packages
COPY requirements.txt /requirements.txt
# Install Pip Requirements
RUN pip install -r requirements.txt
ENV PYTHONPATH "${PYTHONPATH}:/home/user"
WORKDIR /home/user
requirements.txt
fuzzywuzzy
postal
Created a pipeline.py file with the following Beam pipeline code:
import apache_beam as beam
import argparse
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from postal.expand import expand_address
from postal.parser import parse_address
from fuzzywuzzy import fuzz
def run(argv=None, save_main_session=True):
    parser = argparse.ArgumentParser()
    known_args, pipeline_args = parser.parse_known_args(argv)
    pipeline_options = PipelineOptions(pipeline_args)
    pipeline_options.view_as(SetupOptions).save_main_session = save_main_session

    addresses_examples = [
        "465 windward pkwy, alpharetta, georgia, u.s.a.",
        "2018 colby taylor drive, 40475, richmond, usa",
        "19-21 city road, chester ,chester ch1 3ae",
        "no.12 lishi hutong, chaoyangmen ; nei nanxiaoj",
        "building b, 25 yuan da road haidian district,"
    ]

    class ParseAddress(beam.DoFn):
        def process(self, text):
            yield parse_address(text)

    with beam.Pipeline(options=pipeline_options) as p:
        plants = (
            p
            | 'Adresses' >> beam.Create(addresses_examples)
            | 'Parser' >> beam.ParDo(ParseAddress())
            | beam.Map(print))


if __name__ == '__main__':
    run()
Ran the script using the command:
python3 -m pipeline --runner=PortableRunner --environment_type="DOCKER" --environment_config="beam-text:0.1" --job_endpoint=embed
(beam-text:0.1 is my image name)
But I am still receiving the error message:
No module named 'apache_beam'
It seems that Beam is ignoring my custom container arguments.

Add apache_beam to your requirements.txt. You need apache_beam to be installed both inside and outside of the container.

If you use a virtual env to launch your Beam job, you need to have the same Python packages installed in your virtual env and in the Docker image (including the Beam Python SDK itself, with the GCP extras if you use them).
The packages for the virtual env can be managed by the requirements.txt file or tools like Pipenv or Poetry.
The runner launched from the virtual env instantiates the job; then, in the execution phase, the job uses the Docker image.
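For example, a minimal sketch of the launch environment (assuming it should match the 2.25.0 SDK used in the base image; note that installing postal outside the container also requires libpostal on the host):
# on the machine that launches the job, outside the container
python3 -m venv beam-env
. beam-env/bin/activate
pip install apache-beam==2.25.0 fuzzywuzzy postal
python3 -m pipeline --runner=PortableRunner --environment_type="DOCKER" --environment_config="beam-text:0.1" --job_endpoint=embed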

Related

Installing a python project using Poetry in a Docker container

I am using Poetry to install a Python project in a Docker container. Below you can find my Dockerfile, which used to work fine until recently, when I switched to a new version of Poetry (1.2.1) and the new recommended Poetry installer:
# pull official base image
FROM ubuntu:20.04
ENV PATH = "${PATH}:/home/poetry/bin"
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
# copy project
COPY . $APP_HOME
WORKDIR $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false
RUN poetry install --only main
# copy entrypoint-prod.sh
COPY ./entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh
RUN chmod a+x $APP_HOME/entrypoint.sh
# chown all the files to the app user
RUN chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
The poetry install works fine: I attached to a running container and ran it myself, and it works without problems. However, when I open a Python console and try to import a module (django) which is installed by the Poetry project, the module is not found. Please note that I am installing my project into the system environment (poetry config virtualenvs.create false). I verified that there is only one version of Python installed in the Docker container. The specific error I get when trying to import a Python module installed by Poetry is: ModuleNotFoundError: No module named xxxx
Although this is not an answer, it is too long to fit within the comment section. It is rather a piece of advice:
declare your ENV at the top of the Dockerfile to make it easier to read.
merge the multiple RUN commands together to avoid creating useless intermediate layers. In the particular case of apt-get install, this also prevents you from installing packages from a stale package list dating back to the first "apt-get update": since the command line has not changed, Docker will not re-execute the command and thus will not refresh the package list.
avoid copying all the files with "." when you have previously copied some specific files to specific places.
Here, your Dockerfile could rather look like this:
# pull official base image
FROM ubuntu:20.04
ENV PATH = "${PATH}:/home/poetry/bin"
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y \
curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN mkdir -p /home/app && \
adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
WORKDIR $APP_HOME
# copy project
COPY . $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false && \
poetry install --only main
# copy entrypoint-prod.sh
RUN cp $APP_HOME/entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh && \
chmod a+x $APP_HOME/entrypoint.sh && \
chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
UPDATE:
Let's get back to your question. Having your program run okay when you "run it yourself" does not mean all the dependencies are met. Indeed, this can simply mean that your module has not been imported yet (and thus has not triggered the ModuleNotFoundError exception yet).
In order to validate this theory, you can either:
create a simple application which imports the failing module and then quits (see the sketch after this list). If the import succeeds then there is something weird indeed.
list the installed modules with poetry show --latest. If the package is listed, then there is something weird indeed.
If none of the above indicates the module is installed, that just means the module is not installed and you should update your Dockerfile to install it.
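For instance, a quick version of the first check, run from inside the running container (django is the module from the question; the container name is a placeholder):
docker exec -it <container> python -c "import django; print(django.__version__)"
poetry show django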
NOTE: I do not know much about Poetry, but you may want to keep a list of external dependencies to be met by your application. In the case of pip3, the list is expressed as a file named requirements.txt and can be installed with pip3 install -r requirements.txt.
It turns out this is a known bug in Poetry: https://github.com/python-poetry/poetry/issues/6459

Why does my container seem to be empty when started as root?

When I get into my container, nothing seems to have been installed.
docker pull brandojazz/iit-term-synthesis:test
then
docker run -u root -ti brandojazz/iit-term-synthesis:test_arm bash
see:
(base) root@897a4007076f:/home/bot# opam switch
[WARNING] Running as root is not recommended
[ERROR] Opam has not been initialised, please run `opam init'
it should have been initialized.
FROM continuumio/miniconda3
# FROM --platform=linux/amd64 continuumio/miniconda3
MAINTAINER Brando Miranda "me#gmail.com"
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ssh \
git \
m4 \
libgmp-dev \
opam \
wget \
ca-certificates \
rsync \
strace \
gcc
# rlwrap \
# sudo
# https://github.com/giampaolo/psutil/pull/2103
RUN useradd -m bot
# format for chpasswd user_name:password
# RUN echo "bot:bot" | chpasswd
# RUN && adduser docker sudo
WORKDIR /home/bot
USER bot
ADD https://api.github.com/repos/IBM/pycoq/git/refs/heads/main version.json
# -- setup opam like VP's PyCoq
RUN opam init --disable-sandboxing
# compiler + '_' + coq_serapi + '.' + coq_serapi_pin
RUN opam switch create ocaml-variants.4.07.1+flambda_coq-serapi.8.11.0+0.11.1 ocaml-variants.4.07.1+flambda
RUN opam switch ocaml-variants.4.07.1+flambda_coq-serapi.8.11.0+0.11.1
RUN eval $(opam env)
RUN opam repo add coq-released https://coq.inria.fr/opam/released
# RUN opam pin add -y coq 8.11.0
# ['opam', 'repo', '--all-switches', 'add', '--set-default', 'coq-released', 'https://coq.inria.fr/opam/released']
RUN opam repo --all-switches add --set-default coq-released https://coq.inria.fr/opam/released
RUN opam update --all
RUN opam pin add -y coq 8.11.0
#RUN opam install -y --switch ocaml-variants.4.07.1+flambda_coq-serapi_coq-serapi_8.11.0+0.11.1 coq-serapi 8.11.0+0.11.1
RUN opam install -y coq-serapi
#RUN eval $(opam env)
#
## makes sure depedencies for pycoq are installed once already in the docker image
#RUN pip install https://github.com/ddelange/psutil/releases/download/release-5.9.1/psutil-5.9.1-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
#ENV WANDB_API_KEY="SECRET"
#RUN pip install wandb --upgrade
#
#RUN pip install ultimate-utils
## RUN pip install pycoq # do not uncomment on arm, unless serlib is removed from setup.py in the pypi pycoq version.
## RUN pip install ~/iit-term-synthesis # likely won't work cuz we don't have iit or have pused it to pypi
#
## then make sure editable mode is done to be able to use changing pycoq from system
#RUN echo "pip install -e /home/bot/ultimate-utils" >> ~/.bashrc
#RUN echo "pip install -e /home/bot/pycoq" >> ~/.bashrc
#RUN echo "pip install -e /home/bot/iit-term-synthesis" >> ~/.bashrc
#RUN echo "pip install wandb --upgrade" >> ~/.bashrc
#
#RUN echo "eval $(opam env)" >> ~/.bashrc
## - set env variable for bash terminal prompt p1 to be nicely colored
#ENV force_color_prompt=yes
#
#RUN mkdir -p /home/bot/data/
# RUN pytest --pyargs pycoq
#CMD /bin/bash
NB: This may not be your only problem (I have no idea what opam is or how it works), but one thing jumps out:
This...
RUN eval $(opam env)
...doesn't do anything useful on its own. Each RUN invocation happens in a new shell (and a new image layer); environment variables set in one RUN command aren't going to be visible in a subsequent RUN command.
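A minimal illustration of that behavior (FOO is just a placeholder variable name):
RUN export FOO=bar          # FOO only exists in this shell / layer
RUN echo "FOO is '$FOO'"    # runs in a new shell, so this prints an empty value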
Rather than a list of single-command RUN commands, chain everything together in a single command:
RUN eval $(opam env) && \
opam repo add coq-released https://coq.inria.fr/opam/released && \
opam repo --all-switches add --set-default coq-released https://coq.inria.fr/opam/released && \
opam update --all && \
opam pin add -y coq 8.11.0 && \
opam install -y coq-serapi
Because the above runs in a single shell, the environment set by eval $(opam env) will be available to all the following commands.

Errors Installing singularity inside dockerfile

I am trying to run a Nextflow pipeline which uses an older version of Nextflow (21.04.3) and Java version 8. Since I have to use this pipeline on a remote server, I can only use Singularity.
As this Nextflow pipeline also uses singularity pull calls, I need Singularity installed inside the Docker image as well. I can then convert this Docker image to a Singularity image and move it to the remote server.
I am trying to install Singularity inside the Dockerfile, but I am getting errors.
This is the dockerfile that I am using,
FROM python:3.8.9-slim
LABEL authors="phil.ewels#scilifelab.se,erik.danielsson#scilifelab.se" \
description="Docker image containing requirements for the nfcore tools"
# Do not pick up python packages from $HOME
ENV PYTHONNUSERSITE=1
# Update pip to latest version
RUN python -m pip install --upgrade pip
# Install dependencies
COPY requirements.txt requirements.txt
RUN python -m pip install -r requirements.txt
# Install Nextflow dependencies
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y git \
&& apt-get install -y wget
# Create man dir required for Java installation
# and install Java
RUN mkdir -p /usr/share/man/man1 \
&& apt-get install -y openjdk-11-jre \
&& apt-get clean -y && rm -rf /var/lib/apt/lists/*
# Install Singularity
RUN wget -O- http://neuro.debian.net/lists/xenial.us-ca.full | tee /etc/apt/sources.list.d/neurodebian.sources.list && \ apt-key adv --recv-keys --keyserver hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9 && \ apt-get update
RUN apt-get install -y singularity-container
# Setup ARG for NXF_VER ENV
ARG NXF_VER=""
ENV NXF_VER ${NXF_VER}
# Install Nextflow
RUN wget https://github.com/nextflow-io/nextflow/releases/download/v21.04.3/nextflow | bash \
&& mv nextflow /usr/local/bin \
&& chmod a+rx /usr/local/bin/nextflow
# Add the nf-core source files to the image
COPY . /usr/src/nf_core
WORKDIR /usr/src/nf_core
# Install nf-core
RUN python -m pip install .
# Set up entrypoint and cmd for easy docker usage
CMD [ "." ]
These are the errors I am getting
Step 9/17 : RUN wget -O- http://neuro.debian.net/lists/xenial.us-ca.full | tee
/etc/apt/sources.list.d/neurodebian.sources.list && \ apt-key adv --recv-keys --
keyserver hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9 && \ apt-get update
---> Running in afc3dcbbd1ee
--2022-03-17 17:40:19-- http://neuro.debian.net/lists/xenial.us-ca.full
Resolving neuro.debian.net (neuro.debian.net)... 129.170.233.11
Connecting to neuro.debian.net (neuro.debian.net)|129.170.233.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 262
Saving to: ‘STDOUT’
0K 100% 18.4M=0s
deb http://neurodeb.pirsquared.org data main contrib non-free
#deb-src http://neurodeb.pirsquared.org data main contrib non-free
deb http://neurodeb.pirsquared.org xenial main contrib non-free
#deb-src http://neurodeb.pirsquared.org xenial main contrib non-free
2022-03-17 17:40:19 (18.4 MB/s) - written to stdout [262/262]
/bin/sh: 1: apt-key: not found
The command '/bin/sh -c wget -O- http://neuro.debian.net/lists/xenial.us-ca.full | tee /etc/apt/sources.list.d/neurodebian.sources.list && \ apt-key adv --recv-keys --keyserver hkp://pool.sks-keyservers.net:80 0xA5D32F012649A5A9 && \ apt-get update'
returned a non-zero code: 127
Is there a way to install Singularity using a Dockerfile?
Thanks
I made some changes to the Dockerfile based on the method to install Singularity on Linux given here.
The complete Dockerfile with which I was able to successfully run Nextflow, Java and Singularity (within Singularity) is given below:
FROM python:3.8.9-slim
LABEL authors="phil.ewels#scilifelab.se,erik.danielsson#scilifelab.se" \
description="Docker image containing requirements for the nfcore tools"
# Do not pick up python packages from $HOME
ENV PYTHONNUSERSITE=1
# Update pip to latest version
RUN python -m pip install --upgrade pip
# Install dependencies
COPY requirements.txt requirements.txt
RUN python -m pip install -r requirements.txt
# Install Nextflow dependencies
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y git \
&& apt-get install -y wget
# Create man dir required for Java installation
# and install Java
RUN mkdir -p /usr/share/man/man1 \
&& apt-get install -y openjdk-11-jre \
&& apt-get clean -y && rm -rf /var/lib/apt/lists/*
# Install Singularity
RUN apt-get update && apt-get install -y \
build-essential \
libssl-dev \
uuid-dev \
libgpgme11-dev \
squashfs-tools \
libseccomp-dev \
wget \
pkg-config \
procps
# Download Go source version 1.16.3, install them and modify the PATH
ENV VERSION=1.16.3
ENV OS=linux
ENV ARCH=amd64
RUN wget https://dl.google.com/go/go$VERSION.$OS-$ARCH.tar.gz && \
tar -C /usr/local -xzvf go$VERSION.$OS-$ARCH.tar.gz && \
rm go$VERSION.$OS-$ARCH.tar.gz && \
echo 'export PATH=$PATH:/usr/local/go/bin' | tee -a /etc/profile
# Download Singularity from version 3.7.3 (security version)
ENV VERSION=3.7.3
RUN wget https://github.com/sylabs/singularity/releases/download/v${VERSION}/singularity-${VERSION}.tar.gz && \
tar -xzf singularity-${VERSION}.tar.gz
# Compile Singularity sources and install it
RUN export PATH=$PATH:/usr/local/go/bin && \
cd singularity && \
./mconfig --without-suid && \
make -C ./builddir && \
make -C ./builddir install
# Setup ARG for NXF_VER ENV
ARG NXF_VER=""
ENV NXF_VER ${NXF_VER}
# Install Nextflow
RUN wget https://github.com/nextflow-io/nextflow/releases/download/v21.04.3/nextflow | bash \
&& mv nextflow /usr/local/bin \
&& chmod a+rx /usr/local/bin/nextflow
# Add the nf-core source files to the image
COPY . /usr/src/nf_core
WORKDIR /usr/src/nf_core
# Install nf-core
RUN python -m pip install .
# Set up entrypoint and cmd for easy docker usage
CMD [ "." ]
The file named requirements.txt used in the above Dockerfile is given below:
click
GitPython
jinja2
jsonschema
packaging
prompt_toolkit>=3.0.3
pyyaml
pytest-workflow
questionary>=1.8.0
requests_cache
requests
rich>=10.0.0
tabulate
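For completeness, this is roughly how the resulting image can then be built and converted to a Singularity image for the remote server (a sketch; the image and file names are placeholders):
docker build -t nfcore-tools:local .
singularity build nfcore-tools.sif docker-daemon://nfcore-tools:local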

Dockerfile: Python3 not found

I am trying to convert a bash script to a Dockerfile, since we are going the containerization route with AWS Batch.
Basically, I install CPLEX (an optimization library) and Anaconda, install some related packages, check that my environment is good to go, and then kick off a shell script to run the batch job.
Here is a snippet of my Dockerfile:
FROM amazonlinux:latest
# Download packages for container
RUN yum update -y
RUN yum -y install which unzip aws-cli \
RUN yum install -y tar.x86_64
RUN yum install gzip -y
RUN yum install ncompress -y
RUN yum -y install wget
RUN yum install -y nano
# Set working directory
WORKDIR /setup
#: Copy CPLEX installer binary and installation script.
COPY cplex_odee1210.linux-x86-64.bin /setup/
COPY cplex_installer_input.sh /setup/
#: Install CPLEX and update .bashrc
RUN chmod +x /setup/cplex_odee1210.linux-x86-64.bin
RUN chmod +x cplex_installer_input.sh
RUN ./cplex_installer_input.sh | bash cplex_odee1210.linux-x86-64.bin
RUN echo 'export PATH=$PATH:/opt/ibm/ILOG/CPLEX_Optimizer1210/cplex/bin/x86-64_linux' >>/root/.bashrc \
&& /bin/bash -c "source ~/.bashrc"
ENV PATH $PATH:/opt/ibm/ILOG/CPLEX_Optimizer1210/cplex/bin/x86-64_linux
#: Download Anaconda
COPY Anaconda3-2019.10-Linux-x86_64.sh /setup/
RUN bash Anaconda3-2019.10-Linux-x86_64.sh -b -p /home/ec2-user/anaconda3
RUN echo 'export PATH=$PATH:/home/ec2-user/anaconda3/bin' >>/root/.bashrc \
&& /bin/bash -c "source ~/.bashrc"
ENV PATH $PATH:/home/ec2-user/anaconda3/bin
RUN conda install pandas -y \
&& conda install numpy -y \
&& conda install ujson -y \
&& pip install docplex \
&& pip install boto3 \
&& pip install grpcio \
&& pip install grpcio-tools
RUN python3 -m docplex.mp.environment
ADD fetch_and_run.sh /usr/local/bin/fetch_and_run.sh
ENTRYPOINT ["/usr/local/bin/fetch_and_run.sh"]
From there, I kick off a bash script
#!/bin/bash
date
echo "Args: $#"
env
echo "script_path: $1"
echo "script_name: $2"
echo "path_prefix: $3"
echo "jobID: $AWS_BATCH_JOB_ID"
echo "jobQueue: $AWS_BATCH_JQ_NAME"
echo "computeEnvironment: $AWS_BATCH_CE_NAME"
echo "current directory: $(pwd)"
mkdir /tmp/scripts/
aws s3 cp $1 /tmp/scripts/$2
python3 /tmp/scripts/${@:2}
But for some reason, I keep getting
/tmp/tmp.hQlWYBEFs/batch-file-temp: line 20: python3: command not found
Do I need to change some PATH variables? Why isn't Docker picking up my Python 3 version?
The image needs to have python3 installed and on the PATH. Building and running images works only off of the files and programs that exist inside the container; the python3 you have installed on your own system is not available.
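For example, one minimal way to satisfy that in this Dockerfile (a sketch, assuming the amazonlinux yum repositories provide a python3 package; alternatively, make sure the Anaconda bin directory that contains python3 is on the PATH seen at runtime):
RUN yum install -y python3
RUN python3 --version   # fails the build early if python3 is still missing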

Cannot launch gunicorn flask app with torch model on the docker

Does anyone have a working example of a Docker setup that uses GPU, torch, gunicorn and flask in one application? Torch 1.4.0 throws an exception. Please find the configuration below.
Dockerfile:
FROM nvidia/cuda:10.2-base-ubuntu18.04
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
libx11-6 \
&& rm -rf /var/lib/apt/lists/*
# Create a working directory
RUN mkdir /app
WORKDIR /app
RUN apt-get update
RUN apt-get install -y curl python3.7 python3.7-dev python3.7-distutils
# Register the version in alternatives
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.7 1
# Set python 3 as the default python
RUN update-alternatives --set python /usr/bin/python3.7
# Upgrade pip to latest version
RUN curl -s https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python get-pip.py --force-reinstall && \
rm get-pip.py
# Set the default command to python3
COPY requirements.txt .
RUN pip --no-cache-dir install -r requirements.txt
RUN pip install torch torchvision
WORKDIR /usr/src/app
COPY . ./
CMD python ./new_main.py --workers 1
And the new_main.py:
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--test", action='store_true')
    parser.add_argument("--workers", type=int, default=1)
    args = parser.parse_args()
    if check_test_mode(args.test):
        number_of_GPU_workers = args.workers or 1
        options = {
            'bind': '%s:%s' % ('0.0.0.0', str(port)),
            'workers': number_of_GPU_workers,
            'timeout': 300
        }
        StandaloneApplication(app, options).run()
init()
The route I am using:
@app.route("/api/work", methods=["POST"])
def work():
    try:
        body = request.get_json()
        if app.worker is None:
            app.worker = worker()
            app.worker.load_models()
        ...
And here it throws the exception:
2020-04-09 11:33:33,544 loading file /mnt/models/best-model
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Command that I am using:
sudo docker run -p 8889:8888 -e MODELSLOCATION=/mnt/models --gpus all -v $MODELSLOCATION:/mnt/models cc14ffc68256
For torch 1.4.0, the solution that works for me is the following. You need to launch the Flask app from a separate function, init.
Most importantly, you need to put import torch inside this function and remove any top-level occurrences of it from the Flask launch file. Torch 1.4.0 has some issues with multiprocessing.
def init():
    if __name__ == '__main__':
        import torch
        parser = argparse.ArgumentParser()
        parser.add_argument("--test", action='store_true')
        parser.add_argument("--workers", type=int, default=1)
        args = parser.parse_args()
        torch.multiprocessing.set_start_method('spawn')
        if check_test_mode(args.test):
            number_of_GPU_workers = args.workers or 1
            options = {
                'bind': '%s:%s' % ('0.0.0.0', str(port)),
                'workers': number_of_GPU_workers,
                'timeout': 900
            }
            StandaloneApplication(app, options).run()
init()

Resources