Error while importing torch inside a Docker image inside a VM - docker

What I have:
I have set up an Ubuntu VM using Vagrant. Inside this VM, I want to build a Docker image that runs some services, which will be connected to some clients outside the VM. This structure is fixed and cannot be changed. One of the Docker images uses ML frameworks, namely TensorFlow and PyTorch. The source code to be executed inside the Docker image is bundled with PyInstaller. The building and bundling work perfectly.
But if I try to run the built Docker image, I get the following error message:
[1] WARNING: file already exists but should not: /tmp/_MEIl2gg3t/torch/_C.cpython-37m-x86_64-linux-gnu.so
[1] WARNING: file already exists but should not: /tmp/_MEIl2gg3t/torch/_dl.cpython-37m-x86_64-linux-gnu.so
['/tmp/_MEIl2gg3t/base_library.zip', '/tmp/_MEIl2gg3t/lib-dynload', '/tmp/_MEIl2gg3t']
[8] Failed to execute script '__main__' due to unhandled exception!
Traceback (most recent call last):
File "__main__.py", line 4, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "app.py", line 6, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "controller.py", line 3, in <module>
File "PyInstaller/loader/pyimod03_importers.py", line 495, in exec_module
File "torch/__init__.py", line 199, in <module>
ImportError: librt.so.1: cannot open shared object file: No such file or directory
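librt.so.1 ships with glibc, so the import failing inside the container suggests the final run image is missing libraries that torch loads at run time rather than at link time, which a scan of the executable's direct dependencies would not catch. As a rough diagnostic, a sketch like the following (assumptions: `ldd` is available in the image, and the extraction directory path matches the one from the error message) lists the unresolved dependencies of every bundled shared object:

```python
import pathlib
import subprocess

def parse_missing(ldd_output: str) -> set:
    """Collect library names that ldd reports as 'not found'."""
    return {
        line.split()[0]
        for line in ldd_output.splitlines()
        if "not found" in line
    }

def missing_libs(extraction_dir: str) -> list:
    """Run ldd on every bundled .so and gather unresolved dependencies."""
    missing = set()
    for so_file in pathlib.Path(extraction_dir).rglob("*.so*"):
        result = subprocess.run(
            ["ldd", str(so_file)], capture_output=True, text=True
        )
        missing |= parse_missing(result.stdout)
    return sorted(missing)

if __name__ == "__main__":
    # e.g. the /tmp/_MEI... directory from the error message
    print(missing_libs("/tmp/_MEIl2gg3t"))
```

Any library this prints would need to be present in the run image (or added via the image's export-install mechanism) for the bundled app to start.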
Dockerfile
ARG PRJ=unspecified
ARG PYINSTALLER_ARGS=
ARG LD_LIBRARY_PATH_EXTENSION=
ARG PYTHON_VERSION=3.7
###############################################################################
# Stage 1: BUILD PyInstaller
###############################################################################
# Alpine:
#FROM ... as build-pyinstaller
# Ubuntu:
FROM ubuntu:18.04 as build-pyinstaller
ARG PYTHON_VERSION
# Ubuntu:
RUN apt-get update && apt-get install -y \
python$PYTHON_VERSION \
python$PYTHON_VERSION-dev \
python3-pip \
unzip \
# Ubuntu+Alpine:
libc-dev \
g++ \
git
# Make our Python version the default
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python$PYTHON_VERSION 1 && python3 --version
# Alpine:
#
# # Install pycrypto so --key can be used with PyInstaller
# RUN pip install \
# pycrypto
# Install PyInstaller
RUN python3 -m pip install --proxy=${https_proxy} --no-cache-dir \
pyinstaller
###############################################################################
# Stage 2: BUILD our service with Python and pyinstaller
###############################################################################
FROM build-pyinstaller
# Upgrade pip and setuptools
RUN python3 -m pip install --no-cache-dir --upgrade \
pip \
setuptools
# Install pika and protobuf here as they will be required by all our services,
# and installing in every image would take more time.
# If they should no longer be required everywhere, we could instead create
# with-pika and with-protobuf images and copy the required, installed libraries
# to the final build image (similar to how it is done in cpp).
RUN python3 -m pip install --no-cache-dir \
pika \
protobuf
# Add "worker" user to avoid running as root (used in the "run" image below)
# Alpine:
#RUN adduser -D -g "" worker
# Ubuntu:
RUN adduser --disabled-login --gecos "" worker
RUN mkdir -p /opt/export/home/worker && chown -R worker /opt/export/home/worker
ENV HOME /home/worker
# Copy /etc/passwd and /etc/group to the export directory so that they will be installed in the final run image
# (this makes the "worker" user available there; adduser is not available in "FROM scratch").
RUN export-install \
/etc/passwd \
/etc/group
# Create tmp directory that may be required in the runner image
RUN mkdir /opt/export/install/tmp && chmod ogu+rw /opt/export/install/tmp
# When using this build-parent ("FROM ..."), the following ONBUILD commands are executed.
# Files from pre-defined places in the local project directory are copied to the image (see below for details).
# Use the PRJ and MAIN_MODULE arguments that have to be set in the individual builder image that uses this image in FROM ...
ONBUILD ARG PRJ
ONBUILD ENV PRJ=embedded.adas.emergencybreaking
ONBUILD WORKDIR /opt/prj/embedded.adas.emergencybreaking/
# "prj" must contain all files that are required for building the Python app.
# This typically contains a requirements.txt - in this step we only copy requirements.txt
# so that "pip install" is not run after every source file change.
ONBUILD COPY pr[j]/requirements.tx[t] /opt/prj/embedded.adas.emergencybreaking/
# Install required python dependencies for our service - the result stored in a separate image layer
# which is used as cache in the next build even if the source files were changed (those are copied in one of the next steps).
ONBUILD RUN python3 -m pip install --no-cache-dir -r /opt/prj/embedded.adas.emergencybreaking/requirements.txt
# Install all linux packages that are listed in /opt/export/build/opt/prj/*/install-packages.txt
# and /opt/prj/*/install-packages.txt
ONBUILD COPY .placeholder pr[j]/install-packages.tx[t] /opt/prj/embedded.adas.emergencybreaking/
ONBUILD RUN install-build-packages
# "prj" must contain all files that are required for building the Python app.
# This typically contains a dependencies/lib directory - in this step we only copy that directory
# so that "pip install" is not run after every source file change.
ONBUILD COPY pr[j]/dependencie[s]/li[b] /opt/prj/embedded.adas.emergencybreaking/dependencies/lib
# .egg/.whl archives can contain binary .so files which can be linked to system libraries.
# We need to copy the system libraries that are linked from .so files in .whl/.egg packages.
# (Maybe Py)
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.whl /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.egg; do \
if [ -e "$lib_file" ]; then \
mkdir -p /tmp/lib; \
cd /tmp/lib; \
unzip $lib_file "*.so"; \
find /tmp/lib -iname "*.so" -exec ldd {} \; ; \
linked_libs=$( ( find /tmp/lib -iname "*.so" -exec get-linked-libs {} \; ) | egrep -v "^/tmp/lib/" ); \
export-install $linked_libs; \
cd -; \
rm -rf /tmp/lib; \
fi \
done
# Install required python dependencies for our service - the result is stored in a separate image layer
# which can be used as cache in the next build even if the source files are changed (those are copied in one of the next steps).
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.whl; do \
[ -e "$lib_file" ] || continue; \
\
echo "python3 -m pip install --no-cache-dir $lib_file" && \
python3 -m pip install --no-cache-dir $lib_file; \
done
ONBUILD RUN \
for lib_file in /opt/prj/embedded.adas.emergencybreaking/dependencies/lib/*.egg; do \
[ -e "$lib_file" ] || continue; \
\
# Note: This will probably not work any more as easy_install is no longer contained in setuptools!
echo "python3 -m easy_install $lib_file" && \
python3 -m easy_install $lib_file; \
done
# Copy the rest of the prj directory.
ONBUILD COPY pr[j] /opt/prj/embedded.adas.emergencybreaking/
# Show what files we are working on
ONBUILD RUN find /opt/prj/embedded.adas.emergencybreaking/ -type f
# Create an executable with PyInstaller so that python does not need to be installed in the "run" image.
# This produces a lot of error messages like this:
# Error relocating /usr/local/lib/python3.8/lib-dynload/_uuid.cpython-38-x86_64-linux-gnu.so: PyModule_Create2: symbol not found
# If the reported functions/symbols are called from our python service, the missing dependencies probably have to be installed.
ONBUILD ARG PYINSTALLER_ARGS
ONBUILD ENV PYINSTALLER_ARGS=${PYINSTALLER_ARGS}
ONBUILD ARG LD_LIBRARY_PATH_EXTENSION
ONBUILD ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${LD_LIBRARY_PATH_EXTENSION}
ONBUILD RUN mkdir -p /usr/lib64 # Workaround for FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib64' from pyinstaller
ONBUILD RUN \
apt-get update && \
apt-get install -y \
libgl1-mesa-glx \
libx11-xcb1 && \
apt-get clean all && \
rm -r /var/lib/apt/lists/* && \
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" && \
echo "pyinstaller -p /opt/prj/embedded.adas.emergencybreaking/src -p /opt/prj/embedded.adas.emergencybreaking/dependencies/src -p /usr/local/lib/python3.7/dist-packages --hidden-import=torch --hidden-import=torchvision --onefile ${PYINSTALLER_ARGS} /opt/prj/embedded.adas.emergencybreaking/src/adas_emergencybreaking/__main__.py" && \
pyinstaller -p /opt/prj/embedded.adas.emergencybreaking/src -p /opt/prj/embedded.adas.emergencybreaking/dependencies/src -p /usr/local/lib/python3.7/dist-packages --hidden-import=torch --hidden-import=torchvision --onefile ${PYINSTALLER_ARGS} /opt/prj/embedded.adas.emergencybreaking/src/adas_emergencybreaking/__main__.py ; \
# Maybe we will need to add additional paths with -p ...
# Copy the runnable to our default location /opt/run/app
ONBUILD RUN mkdir -p /opt/run && \
cp -p -v /opt/prj/embedded.adas.emergencybreaking/dist/__main__ /opt/run/app
# Show linked libraries (as static linking does not work yet these have to be copied to the "run" image below)
#ONBUILD RUN get-linked-libs /usr/local/lib/libpython*.so.*
#ONBUILD RUN get-linked-libs /opt/run/app
# Add the executable and all linked libraries to the export/install directory
# so that they will be copied to the final "run" image
ONBUILD RUN export-install $( get-linked-libs /opt/run/app )
# Show what we have produced
ONBUILD RUN find /opt/export -type f
The requirements.txt that is used to install my dependencies looks like this:
numpy
tensorflow-cpu
matplotlib
--find-links https://download.pytorch.org/whl/torch_stable.html
torch==1.11.0+cpu
--find-links https://download.pytorch.org/whl/torch_stable.html
torchvision==0.12.0+cpu
Is there anything obviously wrong here?

Related

Passing "vars" to dbt-snowflake docker container image is throwing errors

I'm running the dbt-snowflake Docker image and need to pass parameters while running the container, so I tried passing --vars from the command prompt, but I get the error below.
12:54:44 Running with dbt=1.3.1
12:54:45 Encountered an error:
'dbt_snowflake://macros/apply_grants.sql'
12:54:45 Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 135, in main
results, succeeded = handle_and_check(args)
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 198, in handle_and_check
task, res = run_from_args(parsed)
File "/usr/local/lib/python3.10/site-packages/dbt/main.py", line 245, in run_from_args
results = task.run()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 453, in run
self._runtime_initialize()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 161, in _runtime_initialize
super()._runtime_initialize()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 94, in _runtime_initialize
self.load_manifest()
File "/usr/local/lib/python3.10/site-packages/dbt/task/runnable.py", line 81, in load_manifest
self.manifest = ManifestLoader.get_full_manifest(self.config)
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 221, in get_full_manifest
manifest = loader.load()
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 320, in load
self.load_and_parse_macros(project_parser_files)
File "/usr/local/lib/python3.10/site-packages/dbt/parser/manifest.py", line 422, in load_and_parse_macros
block = FileBlock(self.manifest.files[file_id])
KeyError: 'dbt_snowflake://macros/apply_grants.sql'
Below is my Dockerfile
# Top level build args
ARG build_for=linux/amd64
##
# base image (abstract)
##
FROM --platform=$build_for python:3.10.7-slim-bullseye as base
ARG dbt_core_ref=dbt-core#v1.4.0a1
ARG dbt_postgres_ref=dbt-core#v1.4.0a1
ARG dbt_redshift_ref=dbt-redshift#v1.4.0a1
ARG dbt_bigquery_ref=dbt-bigquery#v1.4.0a1
ARG dbt_snowflake_ref=dbt-snowflake#v1.3.0
ARG dbt_spark_ref=dbt-spark#v1.4.0a1
# special case args
ARG dbt_spark_version=all
ARG dbt_third_party
# System setup
RUN apt-get update \
&& apt-get dist-upgrade -y \
&& apt-get install -y --no-install-recommends \
git \
ssh-client \
software-properties-common \
make \
build-essential \
ca-certificates \
libpq-dev \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/*
# Env vars
ENV PYTHONIOENCODING=utf-8
ENV LANG=C.UTF-8
# Update python
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir
RUN pip install -q --no-cache-dir dbt-core
RUN pip install -q --no-cache-dir dbt-snowflake
# RUN mkdir /root/.dbt
# ADD profiles.yml /root/.dbt
# Set docker basics
WORKDIR /usr/app/dbt/
VOLUME /usr/app
COPY **/profiles.yml /root/.dbt/profiles.yml
COPY . /usr/app/dbt/
ENTRYPOINT ["dbt"]
Here is my docker image docker pull madhuraju/gu-snowflake
Below is the command
docker run -it gu-snowflake:test run --vars '{"testKey": "testValue"}'
Please let me know how I can fix this issue, and also how I can pass values at runtime so that dbt executes only specific models based on the values being passed.

Installing a python project using Poetry in a Docker container

I am installing a Python project with Poetry in a Docker container. Below you can find my Dockerfile, which used to work fine until recently, when I switched to a new version of Poetry (1.2.1) and the new recommended Poetry installer:
# pull official base image
FROM ubuntu:20.04
ENV PATH = "${PATH}:/home/poetry/bin"
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
# copy project
COPY . $APP_HOME
WORKDIR $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false
RUN poetry install --only main
# copy entrypoint-prod.sh
COPY ./entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh
RUN chmod a+x $APP_HOME/entrypoint.sh
# chown all the files to the app user
RUN chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
The poetry install works fine: I attached to a running container, ran it myself, and found that it works without problems. However, when I open a Python console and try to import a module (django) that is installed by the Poetry project, the module is not found. Please note that I am installing my project in the system environment (poetry config virtualenvs.create false). I verified that there is only one version of Python installed in the Docker container. The specific error I get when trying to import a Python module installed by Poetry is: ModuleNotFoundError: No module named xxxx
Although this is not an answer, it is too long to fit within the comment section. It is rather a piece of advice:
declare your ENV at the top of the Dockerfile to make it easier to read.
merge the multiple RUN commands together to avoid creating useless intermediate layers. In the particular case of apt-get install, this will also prevent you from installing a package that dates back to the first "apt-get update": since the command line has not changed, Docker will not re-execute the command and thus will not refresh the package list.
avoid copying all the files in "." when you have previously copied some specific files to specific places.
Here, your Dockerfile could rather look like:
# pull official base image
FROM ubuntu:20.04
ENV PATH = "${PATH}:/home/poetry/bin"
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN apt-get -y update && \
apt upgrade -y && \
apt-get install -y \
python3-pip \
curl \
netcat \
gunicorn && \
rm -fr /var/lib/apt/lists
# alias python2 to python3
RUN ln -s /usr/bin/python3 /usr/bin/python
# Install Poetry
RUN mkdir -p /home/poetry && \
curl -sSL https://install.python-poetry.org | POETRY_HOME=/home/poetry python -
# Cleanup
RUN apt-get remove -y \
curl && \
apt-get clean
RUN pip install --upgrade pip && \
pip install cryptography && \
pip install psycopg2-binary
# create directory for the app user
# create the app user
# create the appropriate directories
RUN mkdir -p /home/app && \
adduser --system --group app && \
mkdir -p $APP_HOME/static-incdtim && \
mkdir -p $APP_HOME/mediafiles
WORKDIR $APP_HOME
# copy project
COPY . $APP_HOME
# Install Python packages
RUN poetry config virtualenvs.create false && \
poetry install --only main
# copy entrypoint-prod.sh
RUN cp $APP_HOME/entrypoint.incdtim.prod.sh $APP_HOME/entrypoint.sh && \
chmod a+x $APP_HOME/entrypoint.sh && \
chown -R app:app $APP_HOME
# change to the app user
USER app
# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.sh"]
UPDATE:
Let's get back to your question. Having your program run okay when you "run it yourself" does not mean all the dependencies are met: it can simply mean that your module has not been imported yet (and thus has not triggered the ModuleNotFoundError exception yet).
In order to validate this theory, you can either:
create a simple application that imports the failing module and then quits. If the import succeeds, then there is something weird indeed.
list the installed modules with poetry show --latest. If the package is listed, then there is something weird indeed.
If neither of the above indicates the module is installed, that just means the module is not installed and you should update your Dockerfile to install it.
NOTE: I do not know much about Poetry, but you may want to have a list of external dependencies to be met by your application. In the case of pip3, the list is expressed as a file named requirements.txt and can be installed with pip3 install -r requirements.txt.
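To make the first check concrete, a minimal import probe could look like this (a sketch; the module name django is just the example from the question). Run it with the same interpreter the container uses:

```python
import importlib
import sys

def probe(module_name: str) -> bool:
    """Return True if the module can be imported, False if it is missing."""
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False

if __name__ == "__main__":
    name = sys.argv[1] if len(sys.argv) > 1 else "django"
    status = "ok" if probe(name) else "MISSING"
    print(f"{name}: {status}")
```

If the probe reports MISSING while poetry show lists the package, the two commands are probably not talking to the same Python environment.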
It turns out this is a known bug in Poetry: https://github.com/python-poetry/poetry/issues/6459

A Dockerfile with 2 ENTRYPOINT

I am learning about Docker, specifically how to write a Dockerfile. Recently I came across this one and couldn't understand why there are 2 ENTRYPOINT instructions in it.
The original file is at this link: CosmWasm/rust-optimizer Dockerfile. The code below is its current content.
FROM rust:1.60.0-alpine as targetarch
ARG BUILDPLATFORM
ARG TARGETPLATFORM
ARG TARGETARCH
ARG BINARYEN_VERSION="version_105"
RUN echo "Running on $BUILDPLATFORM, building for $TARGETPLATFORM"
# AMD64
FROM targetarch as builder-amd64
ARG ARCH="x86_64"
# ARM64
FROM targetarch as builder-arm64
ARG ARCH="aarch64"
# GENERIC
FROM builder-${TARGETARCH} as builder
# Download binaryen sources
ADD https://github.com/WebAssembly/binaryen/archive/refs/tags/$BINARYEN_VERSION.tar.gz /tmp/binaryen.tar.gz
# Extract and compile wasm-opt
# Adapted from https://github.com/WebAssembly/binaryen/blob/main/.github/workflows/build_release.yml
RUN apk update && apk add build-base cmake git python3 clang ninja
RUN tar -xf /tmp/binaryen.tar.gz
RUN cd binaryen-version_*/ && cmake . -G Ninja -DCMAKE_CXX_FLAGS="-static" -DCMAKE_C_FLAGS="-static" -DCMAKE_BUILD_TYPE=Release -DBUILD_STATIC_LIB=ON && ninja wasm-opt
# Run tests
RUN cd binaryen-version_*/ && ninja wasm-as wasm-dis
RUN cd binaryen-version_*/ && python3 check.py wasm-opt
# Install wasm-opt
RUN strip binaryen-version_*/bin/wasm-opt
RUN mv binaryen-version_*/bin/wasm-opt /usr/local/bin
# Check cargo version
RUN cargo --version
# Check wasm-opt version
RUN wasm-opt --version
# Download sccache and verify checksum
ADD https://github.com/mozilla/sccache/releases/download/v0.2.15/sccache-v0.2.15-$ARCH-unknown-linux-musl.tar.gz /tmp/sccache.tar.gz
RUN sha256sum /tmp/sccache.tar.gz | egrep '(e5d03a9aa3b9fac7e490391bbe22d4f42c840d31ef9eaf127a03101930cbb7ca|90d91d21a767e3f558196dbd52395f6475c08de5c4951a4c8049575fa6894489)'
# Extract and install sccache
RUN tar -xf /tmp/sccache.tar.gz
RUN mv sccache-v*/sccache /usr/local/bin/sccache
RUN chmod +x /usr/local/bin/sccache
# Check sccache version
RUN sccache --version
# Add scripts
ADD optimize.sh /usr/local/bin/optimize.sh
RUN chmod +x /usr/local/bin/optimize.sh
ADD optimize_workspace.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/optimize_workspace.sh
# Being required for gcc linking of build_workspace
RUN apk add --no-cache musl-dev
ADD build_workspace build_workspace
RUN cd build_workspace && \
echo "Installed targets:" && (rustup target list | grep installed) && \
export DEFAULT_TARGET="$(rustc -vV | grep 'host:' | cut -d' ' -f2)" && echo "Default target: $DEFAULT_TARGET" && \
# Those RUSTFLAGS reduce binary size from 4MB to 600 KB
RUSTFLAGS='-C link-arg=-s' cargo build --release && \
ls -lh target/release/build_workspace && \
(ldd target/release/build_workspace || true) && \
mv target/release/build_workspace /usr/local/bin
#
# base-optimizer
#
FROM rust:1.60.0-alpine as base-optimizer
# Being required for gcc linking
RUN apk update && \
apk add --no-cache musl-dev
# Setup Rust with Wasm support
RUN rustup target add wasm32-unknown-unknown
# Add wasm-opt
COPY --from=builder /usr/local/bin/wasm-opt /usr/local/bin
#
# rust-optimizer
#
FROM base-optimizer as rust-optimizer
# Use sccache. Users can override this variable to disable caching.
COPY --from=builder /usr/local/bin/sccache /usr/local/bin
ENV RUSTC_WRAPPER=sccache
# Assume we mount the source code in /code
WORKDIR /code
# Add script as entry point
COPY --from=builder /usr/local/bin/optimize.sh /usr/local/bin
ENTRYPOINT ["optimize.sh"]
# Default argument when none is provided
CMD ["."]
#
# workspace-optimizer
#
FROM base-optimizer as workspace-optimizer
# Assume we mount the source code in /code
WORKDIR /code
# Add script as entry point
COPY --from=builder /usr/local/bin/optimize_workspace.sh /usr/local/bin
COPY --from=builder /usr/local/bin/build_workspace /usr/local/bin
ENTRYPOINT ["optimize_workspace.sh"]
# Default argument when none is provided
CMD ["."]
According to this document, only the last ENTRYPOINT will take effect. But those two are in two different build stages, so will both of them take effect in some case, or is this just a bug?
You can keep replacing the entry point down the file; however, that's a multi-stage Dockerfile, so if you build a given stage, you'll get a different entry point.
For example:
docker build --target rust-optimizer .
will build up to and including that stage; a container run from the resulting image will execute optimize.sh.
however
docker build .
builds the entire file; a container run from the resulting image will execute optimize_workspace.sh (the entry point of the final stage).

Dockerfile cannot find executable script (no such file or directory)

I'm writing a Dockerfile in order to create an image for a web server (a Shiny server, more precisely). It works well, but it depends on a huge database folder (db/) that is not distributed with the package, so I want to do all this preprocessing while creating the image, by running the corresponding script in the Dockerfile.
I expected this to be simple, but I'm struggling to figure out where my files are located within the image.
This repo has the following structure:
Dockerfile
preprocessing_files
configuration_files
app/
application_files
db/
processed_files
So app/db/ does not exist initially; it is created and filled with files when preprocessing_files are run.
The Dockerfile is the following:
# Install R version 3.6
FROM r-base:3.6.0
# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev/unstable \
libxml2-dev \
libxt-dev \
libssl-dev
# Download and install ShinyServer (latest version)
RUN wget --no-verbose https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/VERSION -O "version.txt" && \
VERSION=$(cat version.txt) && \
wget --no-verbose "https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/shiny-server-$VERSION-amd64.deb" -O ss-latest.deb && \
gdebi -n ss-latest.deb && \
rm -f version.txt ss-latest.deb
# Install R packages that are required
RUN R -e "install.packages(c('shiny', 'flexdashboard','rmarkdown','tidyverse','plotly','DT','drc','gridExtra','fitdistrplus'), repos='http://cran.rstudio.com/')"
# Copy configuration files into the Docker image
COPY shiny-server.conf /etc/shiny-server/shiny-server.conf
COPY /app /srv/shiny-server/
COPY /app/db /srv/shiny-server/app/
# Make the ShinyApp available at port 80
EXPOSE 80
CMD ["/usr/bin/shiny-server"]
The above file works well if preprocessing_files are run in advance, so that app/application_files can successfully read app/db/processed_files. How could this script be run in the Dockerfile? To me the intuitive solution would simply be to write:
RUN bash -c "preprocessing.sh"
before the COPY instructions, but then preprocessing_files are not found. If the above instruction is placed after COPY and also WORKDIR app/, the same error happens. I cannot understand why.
You cannot execute code on the host machine from a Dockerfile. The RUN command executes inside the container being built. You can:
Copy preprocessing_files into the Docker image and run preprocessing.sh inside the container (this will increase the size of the image)
Create a makefile/build.sh script that launches preprocessing.sh before executing docker build
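A minimal sketch of the first option, assuming preprocessing.sh sits in preprocessing_files/ next to the Dockerfile and writes its output into a local db/ directory (adjust the paths to your repo):

```dockerfile
# Copy the preprocessing inputs into the image so RUN can see them,
# then execute the script at build time.
COPY preprocessing_files/ /preprocessing/
WORKDIR /preprocessing
RUN bash preprocessing.sh
# Move the generated files to where the app expects them.
RUN mkdir -p /srv/shiny-server/db && cp -r db/* /srv/shiny-server/db/
```

The key point is that RUN only sees the image's own filesystem, so the COPY has to happen before the script is executed.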

Syntaxnet spec file and Docker?

I'm trying to learn SyntaxNet. I have it running through Docker, but I really don't know much about either program, SyntaxNet or Docker. On the GitHub SyntaxNet page it says:
The SyntaxNet models are configured via a combination of run-time
flags (which are easy to change) and a text format TaskSpec protocol
buffer. The spec file used in the demo is in
syntaxnet/models/parsey_mcparseface/context.pbtxt.
How exactly do I find the spec file to edit it?
I compiled SyntaxNet in a Docker container using these Instructions.
FROM java:8
ENV SYNTAXNETDIR=/opt/tensorflow PATH=$PATH:/root/bin
RUN mkdir -p $SYNTAXNETDIR \
&& cd $SYNTAXNETDIR \
&& apt-get update \
&& apt-get install git zlib1g-dev file swig python2.7 python-dev python-pip -y \
&& pip install --upgrade pip \
&& pip install -U protobuf==3.0.0b2 \
&& pip install asciitree \
&& pip install numpy \
&& wget https://github.com/bazelbuild/bazel/releases/download/0.2.2b/bazel-0.2.2b-installer-linux-x86_64.sh \
&& chmod +x bazel-0.2.2b-installer-linux-x86_64.sh \
&& ./bazel-0.2.2b-installer-linux-x86_64.sh --user \
&& git clone --recursive https://github.com/tensorflow/models.git \
&& cd $SYNTAXNETDIR/models/syntaxnet/tensorflow \
&& echo "\n\n\n" | ./configure \
&& apt-get autoremove -y \
&& apt-get clean
RUN cd $SYNTAXNETDIR/models/syntaxnet \
&& bazel test --genrule_strategy=standalone syntaxnet/... util/utf8/...
WORKDIR $SYNTAXNETDIR/models/syntaxnet
CMD [ "sh", "-c", "echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh" ]
# COMMANDS to build and run
# ===============================
# mkdir build && cp Dockerfile build/ && cd build
# docker build -t syntaxnet .
# docker run syntaxnet
First, comment out the CMD line in the Dockerfile, then create and cd into an empty directory on your host machine. You can then create a container from the image, mounting a directory in the container to your hard drive:
docker run -it --rm -v $(pwd):/tmp syntaxnet bash
You'll now have a bash session in the container. Copy the spec file into /tmp from /opt/tensorflow/syntaxnet/models/parsey_mcparseface/context.pbtxt (I'm guessing that's where it is given the info you've provided above -- I can't get your dockerfile to build an image so I can't confirm it; you can always run find . -name context.pbtxt from root to find it), and exit the container (ctrl-d or exit).
You now have the file on your host's hard drive, ready to edit, but you really want it in a running container. If the directory it comes from contains only that file, then you can simply mount your host directory at that path in the container. If it contains other things, then you can use a so-called bootstrap script to move the file from your mounted directory (in the example above, that's /tmp) to its home location. Alternatively, you may be able to tell the software where to find the spec file with a flag, but that will take more research.
