SyntaxNet spec file and Docker?

I'm trying to learn SyntaxNet. I have it running through Docker, but I really don't know much about either SyntaxNet or Docker. The SyntaxNet GitHub page says:
The SyntaxNet models are configured via a combination of run-time
flags (which are easy to change) and a text format TaskSpec protocol
buffer. The spec file used in the demo is in
syntaxnet/models/parsey_mcparseface/context.pbtxt.
How exactly do I find the spec file to edit it?
I compiled SyntaxNet in a Docker container using these instructions:
FROM java:8
ENV SYNTAXNETDIR=/opt/tensorflow PATH=$PATH:/root/bin
RUN mkdir -p $SYNTAXNETDIR \
&& cd $SYNTAXNETDIR \
&& apt-get update \
&& apt-get install git zlib1g-dev file swig python2.7 python-dev python-pip -y \
&& pip install --upgrade pip \
&& pip install -U protobuf==3.0.0b2 \
&& pip install asciitree \
&& pip install numpy \
&& wget https://github.com/bazelbuild/bazel/releases/download/0.2.2b/bazel-0.2.2b-installer-linux-x86_64.sh \
&& chmod +x bazel-0.2.2b-installer-linux-x86_64.sh \
&& ./bazel-0.2.2b-installer-linux-x86_64.sh --user \
&& git clone --recursive https://github.com/tensorflow/models.git \
&& cd $SYNTAXNETDIR/models/syntaxnet/tensorflow \
&& echo "\n\n\n" | ./configure \
&& apt-get autoremove -y \
&& apt-get clean
RUN cd $SYNTAXNETDIR/models/syntaxnet \
&& bazel test --genrule_strategy=standalone syntaxnet/... util/utf8/...
WORKDIR $SYNTAXNETDIR/models/syntaxnet
CMD [ "sh", "-c", "echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh" ]
# COMMANDS to build and run
# ===============================
# mkdir build && cp Dockerfile build/ && cd build
# docker build -t syntaxnet .
# docker run syntaxnet

First, comment out the CMD line in the Dockerfile and rebuild the image, then create and cd into an empty directory on your host machine. You can then create a container from the image, mounting a directory in the container onto your hard drive:
docker run -it --rm -v "$(pwd)":/tmp syntaxnet bash
You'll now have a bash session in the container. Copy the spec file into /tmp from /opt/tensorflow/syntaxnet/models/parsey_mcparseface/context.pbtxt (I'm guessing that's where it is given the info you've provided above -- I can't get your Dockerfile to build an image, so I can't confirm it; you can always run find . -name context.pbtxt from the root directory to locate it), then exit the container (ctrl-d or exit).
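For example, inside the container session (the path below is the guess from above -- substitute whatever find prints):
find / -name context.pbtxt 2>/dev/null
cp /opt/tensorflow/syntaxnet/models/parsey_mcparseface/context.pbtxt /tmp/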
You now have the file on your host's hard drive, ready to edit, but you really want it in a running container. If the directory it comes from contains only that file, you can simply mount your host file at that path in the container (see the sketch after this paragraph). If it contains other things, you can use a so-called bootstrap script to move the file from your mounted directory (in the example above, that's /tmp) to its home location. Alternatively, you may be able to tell the software where to find the spec file with a run-time flag, but that will take more research.
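A minimal sketch of the single-file mount, assuming the guessed path above is correct:
docker run -it --rm -v "$(pwd)/context.pbtxt":/opt/tensorflow/syntaxnet/models/parsey_mcparseface/context.pbtxt syntaxnet
This bind-mounts your edited copy over the one baked into the image, leaving the rest of that directory untouched.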

Related

How does this Dockerfile actually run logstash without an entrypoint or cmd?

Just doing a docker container start on this official Logstash Docker container does make Logstash run properly, given the right config.
Its Dockerfile does not have an ENTRYPOINT or CMD, or anything of the sort, though, and I am not issuing one in the start command either. So how is Logstash actually getting executed in this case?
I need to know because I need to edit the command for other reasons. We're working on running it in Kubernetes, but are just testing with local Docker for now.
https://github.com/elastic/logstash/blob/7.15/Dockerfile
Copied for easy reference:
FROM ubuntu:bionic
RUN apt-get update && \
apt-get install -y zlib1g-dev build-essential vim rake git curl libssl-dev libreadline-dev libyaml-dev \
libxml2-dev libxslt-dev openjdk-11-jdk-headless curl iputils-ping netcat && \
apt-get clean
WORKDIR /root
RUN adduser --disabled-password --gecos "" --home /home/logstash logstash && \
mkdir -p /usr/local/share/ruby-build && \
mkdir -p /opt/logstash && \
mkdir -p /opt/logstash/data && \
mkdir -p /mnt/host && \
chown logstash:logstash /opt/logstash
USER logstash
WORKDIR /home/logstash
# used by the purge policy
LABEL retention="keep"
# Setup gradle wrapper. When running any `gradle` command, a `settings.gradle` is expected (and will soon be required).
# This section adds the gradle wrapper, `settings.gradle` and sets the permissions (setting the user to root for `chown`
# and working directory to allow this and then reverts back to the previous working directory and user.
COPY --chown=logstash:logstash gradlew /opt/logstash/gradlew
COPY --chown=logstash:logstash gradle/wrapper /opt/logstash/gradle/wrapper
COPY --chown=logstash:logstash settings.gradle /opt/logstash/settings.gradle
WORKDIR /opt/logstash
RUN for iter in `seq 1 10`; do ./gradlew wrapper --warning-mode all && exit_code=0 && break || exit_code=$? && echo "gradlew error: retry $iter in 10s" && sleep 10; done; exit $exit_code
WORKDIR /home/logstash
ADD versions.yml /opt/logstash/versions.yml
ADD LICENSE.txt /opt/logstash/LICENSE.txt
ADD NOTICE.TXT /opt/logstash/NOTICE.TXT
ADD licenses /opt/logstash/licenses
ADD CONTRIBUTORS /opt/logstash/CONTRIBUTORS
ADD Gemfile.template Gemfile.jruby-2.5.lock.* /opt/logstash/
ADD Rakefile /opt/logstash/Rakefile
ADD build.gradle /opt/logstash/build.gradle
ADD rubyUtils.gradle /opt/logstash/rubyUtils.gradle
ADD rakelib /opt/logstash/rakelib
ADD config /opt/logstash/config
ADD spec /opt/logstash/spec
ADD qa /opt/logstash/qa
ADD lib /opt/logstash/lib
ADD pkg /opt/logstash/pkg
ADD tools /opt/logstash/tools
ADD logstash-core /opt/logstash/logstash-core
ADD logstash-core-plugin-api /opt/logstash/logstash-core-plugin-api
ADD bin /opt/logstash/bin
ADD modules /opt/logstash/modules
ADD x-pack /opt/logstash/x-pack
ADD ci /opt/logstash/ci
USER root
RUN rm -rf build && \
mkdir -p build && \
chown -R logstash:logstash /opt/logstash
USER logstash
WORKDIR /opt/logstash
LABEL retention="prune"
If you look at the final layer of the image here, it looks like there is an ENTRYPOINT ["/usr/local/bin/docker-entrypoint"]. The Dockerfile you've linked is probably not the one used to build the published image.
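A quick way to check what an image will run (the tag here is an assumption -- use whichever one you pulled):
docker inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' docker.elastic.co/logstash/logstash:7.15.2
Whatever ENTRYPOINT/CMD this prints is what executes on docker container start when you don't supply a command yourself.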

Micromamba inside Docker container

I have a base Docker image:
FROM ubuntu:21.04
WORKDIR /app
RUN apt-get update && apt-get install -y wget bzip2 \
&& wget -qO- https://micromamba.snakepit.net/api/micromamba/linux-64/latest | tar -xvj bin/micromamba \
&& touch /root/.bashrc \
&& ./bin/micromamba shell init -s bash -p /opt/conda \
&& cp /root/.bashrc /opt/conda/bashrc \
&& apt-get clean autoremove --yes \
&& rm -rf /var/lib/{apt,dpkg,cache,log}
SHELL ["bash", "-l" ,"-c"]
and derive from it another one:
ARG BASE
FROM $BASE
RUN source /opt/conda/bashrc && micromamba activate \
&& micromamba create --file environment.yaml -p /env
While building the second image I get the following error from the RUN section: micromamba: command not found.
If I run the first (base) image manually, I can launch micromamba and it runs correctly.
If I run the intermediate image created while building the second one, micromamba is available via the CLI and runs correctly.
If I inherit from debian:buster or alpine, for example, it builds perfectly.
What is the problem with Ubuntu? Why can't it see micromamba while building the second Docker image?
PS: I'm using Skaffold for the build, so it correctly resolves where $BASE comes from and what it is.
The ubuntu:21.04 image comes with a /root/.bashrc file that begins with:
# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples
# If not running interactively, don't do anything
[ -z "$PS1" ] && return
When the second Dockerfile executes RUN source /opt/conda/bashrc, PS1 is not set and thus the remainder of the bashrc file does not execute. The remainder of the bashrc file is where micromamba initialization occurs, including the setup of the micromamba bash function that is used to activate a micromamba environment.
The debian:buster image has a smaller /root/.bashrc that does not have a line similar to [ -z "$PS1" ] && return and therefore the micromamba function gets loaded.
The alpine image does not come with a /root/.bashrc so it also does not contain the code to exit the file early.
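If you want to verify this, you can print the top of the file straight from the base image (a quick check, not part of the fix):
docker run --rm ubuntu:21.04 head -n 10 /root/.bashrc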
If you want to use the ubuntu:21.04 image, you could modify your first Dockerfile like this (only the line that creates /opt/conda/bashrc changes):
FROM ubuntu:21.04
WORKDIR /app
RUN apt-get update && apt-get install -y wget bzip2 \
&& wget -qO- https://micromamba.snakepit.net/api/micromamba/linux-64/latest | tar -xvj bin/micromamba \
&& touch /root/.bashrc \
&& ./bin/micromamba shell init -s bash -p /opt/conda \
&& grep -vF '[ -z "$PS1" ] && return' /root/.bashrc > /opt/conda/bashrc \
&& apt-get clean autoremove --yes \
&& rm -rf /var/lib/{apt,dpkg,cache,log}
SHELL ["bash", "-l" ,"-c"]
This will strip out the one line that causes the early termination.
Alternatively, you could make use of the existing mambaorg/micromamba Docker image. mambaorg/micromamba:latest is based on debian:slim, but mambaorg/micromamba:jammy will get you an Ubuntu-based image (disclosure: I maintain this image).
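A minimal sketch of a second Dockerfile built on that image instead (it assumes the same environment.yaml from your build context, and follows the usage documented for mambaorg/micromamba):
FROM mambaorg/micromamba:jammy
COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yaml /tmp/environment.yaml
RUN micromamba install -y -n base -f /tmp/environment.yaml && \
micromamba clean --all --yes
micromamba is already on the PATH in that image, so no bashrc sourcing is needed during the build.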

Copy folder from Dockerfile to host [duplicate]

This question already has answers here:
Docker: Copying files from Docker container to host
(27 answers)
Closed 1 year ago.
I have a Dockerfile:
FROM ubuntu:20.04
################################
### INSTALL Ubuntu build tools and prerequisites
################################
# Install build base
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
build-essential \
git \
subversion \
sharutils \
vim \
asciidoc \
binutils \
bison \
flex \
texinfo \
gawk \
help2man \
intltool \
libelf-dev \
zlib1g-dev \
libncurses5-dev \
ncurses-term \
libssl-dev \
python2.7-dev \
unzip \
wget \
rsync \
gettext \
xsltproc && \
apt-get clean && rm -rf /var/lib/apt/lists/*
ARG FORCE_UNSAFE_CONFIGURE=1
RUN git clone https://git.openwrt.org/openwrt/openwrt.git
WORKDIR /openwrt
RUN ./scripts/feeds update -a && ./scripts/feeds install -a
COPY .config /openwrt/.config
RUN mkdir files
WORKDIR /files
RUN mkdir etc
WORKDIR /etc
RUN mkdir uci-defaults
WORKDIR /uci-defaults
COPY xx_custom /openwrt/files/etc/uci-defaults/xx_custom
WORKDIR /openwrt
RUN make -j 4
RUN ls /openwrt/bin/targets/ramips/mt76x8
WORKDIR /root
CMD ["bash"]
I want to copy all the files inside the mt76x8 folder to the host. I want to do that from the Dockerfile, so that when I run the image I get the generated files on my host.
How can I do that?
You can use a volume mount to access the Docker-generated artifacts on the host machine.
You can also run the docker cp command to copy the files to the host machine.
If you don't want to use the docker cp command as mentioned, the only option is to use a volume.
You can also use docker create, once the Docker image is ready, to create the writable container layer and copy data out of it.
You have two choices.
Use Docker volumes to map the /openwrt/bin/targets/ramips/mt76x8 folder when you are running the container, i.e. docker run -v {VolumeName}:/openwrt/bin/targets/ramips/mt76x8. All of the files in the mt76x8 folder will then be available in the volume. If you are using Linux, you will find Docker's named volumes under /var/lib/docker/volumes/.
Use the docker cp command to copy data from the container to the host machine. Here is an example.
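A sketch using docker create plus docker cp, so nothing has to be running (the image and container names are illustrative):
docker build -t openwrt-build .
docker create --name openwrt-tmp openwrt-build
docker cp openwrt-tmp:/openwrt/bin/targets/ramips/mt76x8 ./mt76x8
docker rm openwrt-tmp
docker cp also works on stopped containers, which is why the create/cp/rm sequence is enough here.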

"env" parameter not applied in container

I'm just testing out Docker, so this might be a pretty simple question, but I cannot seem to find out why it's not doing what I expect.
I created a pretty simple Dockerfile for testing, just to build a simple image that installs some packages, clones a git repo, and installs its requirements:
FROM ubuntu:18.04
ENV PYTHONEXEC=python3 \
PIPEXEC=pip \
VIRTUALENVEXEC=virtualenv \
GITREPO=https://github.com/test/test.git \
REPODIR=test
RUN apt-get update && apt-get install -y git \
python3 \
python3-dev \
python3-virtualenv \
python-virtualenv \
qt5-default \
libcurl4-openssl-dev \
libxml2 \
libxml2-dev \
libxslt1-dev \
libssl-dev \
virt-viewer
RUN mkdir -p /app
WORKDIR /app
RUN git clone $GITREPO $REPODIR \
&& $VIRTUALENVEXEC -p $PYTHONEXEC venv \
&& . venv/bin/activate \
&& cd $REPODIR \
&& $PIPEXEC install -r requirements.txt
CMD ["sleep", "1000000"]
Then I build the image with:
docker build -t gitapp:latest .
This works so far. However, if I specify a -e parameter on the docker container run command, it seems not to replace it in the last RUN command.
So if I run docker container run -d -e "REPODIR=blah" gitapp, I expect it to be cloned in /app/blah, but it's still cloned in the /app/test directory.
When I run a docker container exec -it -e "REPODIR=blah" <container-id> env I get:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=2f6ba38341d6
TERM=xterm
REPODIR=blah
PYTHONEXEC=python3
PIPEXEC=pip
VIRTUALENVEXEC=virtualenv
GITREPO=https://github.com/test/test.git
HOME=/root
So it seems that the variable is indeed passed to the container. Then why isn't it passed to the last RUN command, so that it clones the repo into the right directory? Am I missing something basic here?
When you execute docker run, you are instructing the container to execute the Dockerfile's CMD or ENTRYPOINT command. The instructions above those, including every RUN, were already executed during docker build, using the ENV defaults baked into the image; they do not execute again at runtime.
That's exactly why your GitHub repo is cloned into the directory defined initially in the Dockerfile and not into the one passed to the run command with the -e flag.
A workaround would be to alter your image's entrypoint. You may transfer this part
RUN git clone $GITREPO $REPODIR \
&& $VIRTUALENVEXEC -p $PYTHONEXEC venv \
&& . venv/bin/activate \
&& cd $REPODIR \
&& $PIPEXEC install -r requirements.txt
to a bash script (let's call it myscript.sh) that will be executed as the image's entrypoint. Copy this file into the image during the build at a preferred location, ensure it carries the executable flag, and edit your Dockerfile's final command accordingly:
CMD ["/path_to_script/myscript.sh" ]
This, however, has the caveat that the script will be executed each time the container is started, in contrast with your current setup, possibly leading to a startup delay depending on the content of myscript.sh.
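A minimal sketch of myscript.sh under that approach (it reuses the ENV names from your Dockerfile; because the clone now happens at container start, a runtime -e "REPODIR=blah" takes effect):
#!/bin/bash
set -e
# Runs at container start, so runtime env vars such as REPODIR are visible here
git clone "$GITREPO" "$REPODIR"
"$VIRTUALENVEXEC" -p "$PYTHONEXEC" venv
. venv/bin/activate
cd "$REPODIR"
"$PIPEXEC" install -r requirements.txt
exec sleep 1000000
The Dockerfile would then COPY myscript.sh into the image and RUN chmod +x on it, in place of the original RUN block.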

Dockerfile cannot find executable script (no such file or directory)

I'm writing a Dockerfile in order to create an image for a web server (a Shiny server, more precisely). It works well, but it depends on a huge database folder (db/) that is not distributed with the package, so I want to do all this preprocessing while creating the image, by running the corresponding script from the Dockerfile.
I expected this to be simple, but I'm struggling to figure out where my files are located within the image.
This repo has the following structure:
Dockerfile
preprocessing_files
configuration_files
app/
application_files
db/
processed_files
So that app/db/ does not exist, but is created and filled with files when preprocessing_files are run.
The Dockerfile is the following:
# Install R version 3.6
FROM r-base:3.6.0
# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev/unstable \
libxml2-dev \
libxt-dev \
libssl-dev
# Download and install ShinyServer (latest version)
RUN wget --no-verbose https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/VERSION -O "version.txt" && \
VERSION=$(cat version.txt) && \
wget --no-verbose "https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/shiny-server-$VERSION-amd64.deb" -O ss-latest.deb && \
gdebi -n ss-latest.deb && \
rm -f version.txt ss-latest.deb
# Install R packages that are required
RUN R -e "install.packages(c('shiny', 'flexdashboard','rmarkdown','tidyverse','plotly','DT','drc','gridExtra','fitdistrplus'), repos='http://cran.rstudio.com/')"
# Copy configuration files into the Docker image
COPY shiny-server.conf /etc/shiny-server/shiny-server.conf
COPY /app /srv/shiny-server/
COPY /app/db /srv/shiny-server/app/
# Make the ShinyApp available at port 80
EXPOSE 80
CMD ["/usr/bin/shiny-server"]
The file above works well if preprocessing_files are run in advance, so that app/application_files can successfully read app/db/processed_files. How could this script be run from the Dockerfile? To me the intuitive solution would be simply to write:
RUN bash -c "preprocessing.sh"
before the COPY instructions, but then preprocessing_files are not found. If the instruction is written after COPY, together with WORKDIR app/, the same error happens. I cannot understand why.
You cannot execute code on the host machine from a Dockerfile; a RUN command executes inside the container being built, and that filesystem contains only what earlier instructions copied in, which is why your script is not found. You can:
Copy preprocessing_files into the image and run preprocessing.sh inside the container during the build (this will increase the size of the image), as in the sketch below
Create a makefile/build.sh script which launches preprocessing.sh before executing docker build
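A minimal sketch of the first option (the directory layout is an assumption, since preprocessing_files is a placeholder in the structure above):
# Copy the preprocessing inputs into the image first, then run the script
COPY preprocessing_files/ /preprocessing/
WORKDIR /preprocessing
RUN bash preprocessing.sh
RUN executes against the image's filesystem, which contains only what the base image and earlier COPY instructions put there; that is why running the script before any COPY fails with "no such file or directory".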
