I am trying to run Kubernetes on CoreOS. I am using fleet, setup-network-environment, and kube-register to register nodes. However, in my cloud-init file where I write my systemd unit files, the kubelet's unit file won't expand the environment variable properly:
ExecStart=/opt/bin/kubelet \
--address=0.0.0.0 --port=10250 \
--hostname_override=${DEFAULT_IPV4} \
--allow_privileged=true \
--logtostderr=true \
--healthz_bind_address=0.0.0.0
Instead of my public IP, ${DEFAULT_IPV4} is passed through literally as $default_ipv4, which also doesn't resolve to the IP. I know --hostname_override should just take a string, and the command works when I run it from the command line. There are other unit files where ${ENV_VAR} expands fine. Why does it break only in the kubelet's unit file?
EDIT 1
/etc/network-environment
LO_IPV4=127.0.0.1
ENS33_IPV4=192.168.195.242
DEFAULT_IPV4=192.168.195.242
ENS34_IPV4=172.22.22.238
EDIT 2
kubelet unit file
- name: kube-kubelet.service
command: start
content: |
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
Requires=setup-network-environment.service
After=setup-network-environment.service
[Service]
EnvironmentFile=/etc/network-environment
ExecStartPre=/usr/bin/curl -L -o /opt/bin/kubelet -z /opt/bin/kubelet https://storage.googleapis.com/kubernetes-release/release/v0.18.2/bin/linux/amd64/kubelet
ExecStartPre=/usr/bin/chmod +x /opt/bin/kubelet
# wait for kubernetes master to be up and ready
ExecStartPre=/opt/bin/wupiao 172.22.22.10 8080
ExecStart=/opt/bin/kubelet \
--address=0.0.0.0 \
--port=10250 \
--hostname_override=172.22.22.21 \
--api_servers=172.22.22.10:8080 \
--allow_privileged=true \
--logtostderr=true \
--healthz_bind_address=0.0.0.0 \
--healthz_port=10248
Restart=always
RestartSec=10
The Exec*= command line is not a shell command. In my experimenting, systemd was not very good at figuring out where the variable was unless it stood by itself as a word. The examples I looked at online always show the environment variable by itself. So, given a file like /tmp/myfile:
ENV=1.2.3.4
These [Service] definitions won't do what you think:
EnvironmentFile=/tmp/myfile
ExecStart=echo M$ENV
ExecStart=echo $ENV:8080
but, this will work on a line by itself:
EnvironmentFile=/tmp/myfile
ExecStart=echo $ENV
That doesn't help much when trying to pass an argument, like:
EnvironmentFile=/tmp/myfile
ExecStart=echo --myarg=http://$ENV:8080/v2
To accomplish passing the argument I had to put the entire myarg in a string in /tmp/myfile:
ENV="--myarg=http://1.2.3.4:8080/v2"
Finally I could get my argument passed:
EnvironmentFile=/tmp/myfile
ExecStart=echo $ENV
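Applied to the kubelet unit from the question, a sketch of this workaround (the file and variable names here are mine, and something, e.g. an ExecStartPre step, would have to generate the file with the composed flag in it):
# /etc/kubelet-args (generated)
KUBELET_HOSTNAME="--hostname_override=192.168.195.242"
[Service]
EnvironmentFile=/etc/kubelet-args
ExecStart=/opt/bin/kubelet \
--address=0.0.0.0 --port=10250 \
$KUBELET_HOSTNAME \
--allow_privileged=true \
--logtostderr=true \
--healthz_bind_address=0.0.0.0
With the whole flag stored in the variable, $KUBELET_HOSTNAME stands alone as a word and systemd expands it.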
It would seem the issue was in the version of CoreOS in the Vagrant box. After an update of the Vagrant box, the environment variable resolved to the proper value.
Related
I have a TensorFlow training script which I want to run using a Docker container (based on the official TF GPU image). Although everything works just fine, running the container with the script is horribly verbose and ugly. The main problem is that my training script allows the user to specify various directories used during training, for input data, logging, generating output, etc. I don't want to change what my users are used to, so the container needs to be informed of the location of these user-defined directories, so it can mount them. So I end up with something like this:
docker run \
-it --rm --gpus all -d \
--mount type=bind,source=/home/guest/datasets/my-dataset,target=/datasets/my-dataset \
--mount type=bind,source=/home/guest/my-scripts/config.json,target=/config.json \
-v /home/guest/my-scripts/logdir:/logdir \
-v /home/guest/my-scripts/generated:/generated \
train-image \
python train.py \
--data_dir /datasets/my-dataset \
--gpu 0 \
--logdir ./logdir \
--output ./generated \
--config_file ./config.json \
--num_epochs 250 \
--batch_size 128 \
--checkpoint_every 5 \
--generate True \
--resume False
In the above I am mounting a dataset from the host into the container, and also mounting a single config file config.json (which configures the TF model). I specify a logging directory logdir and an output directory generated as volumes. Each of these resources is also passed as a parameter to the train.py script.
This is all very ugly, but I can't see another way of doing it. Of course I could put all this in a shell script, and provide command line arguments which set these duplicated values from the outside. But this doesn't seem a nice solution, because if I want to do anything else with the container, for example check the logs, I would have to fall back to the raw docker command.
I suspect this question will likely be tagged as opinion-based, but I've not found a good solution for this that I can recommend to my users.
As user Ron van der Heijden points out, one solution is to use docker-compose in combination with environment variables defined in an .env file. Nice answer.
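For reference, a minimal sketch of that approach (the service name and file layout are my own; docker-compose substitutes ${VAR} in the compose file from the .env file, and the GPU options are elided here):
# .env
DATA_DIR=/home/guest/datasets/my-dataset
SCRIPTS_DIR=/home/guest/my-scripts
# docker-compose.yml
version: '3'
services:
  train:
    image: train-image
    volumes:
      - ${DATA_DIR}:/datasets/my-dataset
      - ${SCRIPTS_DIR}/config.json:/config.json
      - ${SCRIPTS_DIR}/logdir:/logdir
      - ${SCRIPTS_DIR}/generated:/generated
    command: >
      python train.py --data_dir /datasets/my-dataset
      --logdir /logdir --output /generated
      --config_file /config.json --num_epochs 250
Users then only type docker-compose up train, and the directories live in one editable .env file.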
I am following this link to create a Spark cluster. I am able to run the cluster, but I have to give an absolute path to start spark-shell. I am trying to set environment variables, i.e. PATH and a few others, in start-spark.sh. However, they are not being set inside the container: I tried printing them using printenv inside the container, but these variables are never reflected.
Am I trying to set environment variables incorrectly? Spark cluster is running successfully though.
I am using docker-compose.yml to build and recreate an image and container.
docker-compose up --build
Dockerfile
# builder step used to download and configure spark environment
FROM openjdk:11.0.11-jre-slim-buster as builder
# Add Dependencies for PySpark
RUN apt-get update && apt-get install -y curl vim wget software-properties-common ssh net-tools ca-certificates python3 python3-pip python3-numpy python3-matplotlib python3-scipy python3-pandas python3-simpy
# JDBC driver download and install
ADD https://go.microsoft.com/fwlink/?linkid=2168494 /usr/share/java
RUN update-alternatives --install "/usr/bin/python" "python" "$(which python3)" 1
# Fix the value of PYTHONHASHSEED
# Note: this is needed when you use Python 3.3 or greater
ENV SPARK_VERSION=3.1.2 \
HADOOP_VERSION=3.2 \
SPARK_HOME=/opt/spark \
PYTHONHASHSEED=1
# Download and uncompress spark from the apache archive
RUN wget --no-verbose -O apache-spark.tgz "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" \
&& mkdir -p ${SPARK_HOME} \
&& tar -xf apache-spark.tgz -C ${SPARK_HOME} --strip-components=1 \
&& rm apache-spark.tgz
My Dockerfile-spark
When SPARK_BIN="${SPARK_HOME}/bin/" is set under ENV in the Dockerfile, the environment variable does get set; it is visible inside the docker container using printenv
FROM apache-spark
WORKDIR ${SPARK_HOME}
ENV SPARK_MASTER_PORT=7077 \
SPARK_MASTER_WEBUI_PORT=8080 \
SPARK_LOG_DIR=${SPARK_HOME}/logs \
SPARK_MASTER_LOG=${SPARK_HOME}/logs/spark-master.out \
SPARK_WORKER_LOG=${SPARK_HOME}/logs/spark-worker.out \
SPARK_WORKER_WEBUI_PORT=8080 \
SPARK_MASTER="spark://spark-master:7077" \
SPARK_WORKLOAD="master"
COPY start-spark.sh /
CMD ["/bin/bash", "/start-spark.sh"]
start-spark.sh
#!/bin/bash
. "$SPARK_HOME/bin/load-spark-env.sh"
export SPARK_BIN="${SPARK_HOME}/bin/" # This doesn't work here
export PATH="${SPARK_HOME}/bin/:${PATH}" # This doesn't work here
# When the spark work_load is master run class org.apache.spark.deploy.master.Master
if [ "$SPARK_WORKLOAD" == "master" ];
then
export SPARK_MASTER_HOST=`hostname` # This works here
cd $SPARK_BIN && ./spark-class org.apache.spark.deploy.master.Master --ip $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT >> $SPARK_MASTER_LOG
My File structure is
dockerfile
dockerfile-spark # this uses pre-built image created by dockerfile
start-spark.sh # invoked by dockerfile-spark
docker-compose.yml # uses build parameter to build an image from dockerfile-spark
From inside the master container
root@3abbd4508121:/opt/spark# export
declare -x HADOOP_VERSION="3.2"
declare -x HOME="/root"
declare -x HOSTNAME="3abbd4508121"
declare -x JAVA_HOME="/usr/local/openjdk-11"
declare -x JAVA_VERSION="11.0.11+9"
declare -x LANG="C.UTF-8"
declare -x OLDPWD
declare -x PATH="/usr/local/openjdk-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
declare -x PWD="/opt/spark"
declare -x PYTHONHASHSEED="1"
declare -x SHLVL="1"
declare -x SPARK_HOME="/opt/spark"
declare -x SPARK_LOCAL_IP="spark-master"
declare -x SPARK_LOG_DIR="/opt/spark/logs"
declare -x SPARK_MASTER="spark://spark-master:7077"
declare -x SPARK_MASTER_LOG="/opt/spark/logs/spark-master.out"
declare -x SPARK_MASTER_PORT="7077"
declare -x SPARK_MASTER_WEBUI_PORT="8080"
declare -x SPARK_VERSION="3.1.2"
declare -x SPARK_WORKER_LOG="/opt/spark/logs/spark-worker.out"
declare -x SPARK_WORKER_WEBUI_PORT="8080"
declare -x SPARK_WORKLOAD="master"
declare -x TERM="xterm"
root@3abbd4508121:/opt/spark#
There are a couple of different ways to set environment variables in Docker, and a couple of different ways to run processes. A container normally runs one process, which is controlled by the image's ENTRYPOINT and CMD settings. If you docker exec a second process in the container, that does not run as a child process of the main process, and will not see environment variables that are set by that main process.
In the setup you show here, the start-spark.sh script is the main container process (it is the image's CMD). If you docker exec your-container printenv, it will see things set in the Dockerfile but not things set in this script.
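You can see the difference directly (the container name is a placeholder):
docker exec your-container printenv SPARK_HOME   # /opt/spark -- set by ENV in the Dockerfile
docker exec your-container printenv SPARK_BIN    # prints nothing -- exported only inside start-spark.sh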
Things like filesystem paths will generally be fixed every time you run the container, no matter what command you're running there, so you can specify these in the Dockerfile
ENV SPARK_BIN=${SPARK_HOME}/bin PATH=${SPARK_BIN}:${PATH}
You can specify both an ENTRYPOINT and a CMD in your Dockerfile; if you do, the CMD is passed as arguments to the ENTRYPOINT. This leads to a useful pattern where the CMD is a standard shell command, and the ENTRYPOINT is a wrapper that does first-time setup and then runs it. You can split your script into two:
#!/bin/sh
# spark-env.sh
. "${SPARK_BIN}/load-spark-env.sh"
exec "$@"
#!/bin/sh
# start-spark.sh
spark-class org.apache.spark.deploy.master.Master \
--ip "$SPARK_MASTER_HOST" \
--port "$SPARK_MASTER_PORT" \
--webui-port "$SPARK_MASTER_WEBUI_PORT"
Then in your Dockerfile specify both parts
COPY spark-env.sh start-spark.sh /
ENTRYPOINT ["/spark-env.sh"] # must be JSON-array syntax
CMD ["/start-spark.sh"] # or any other valid CMD
This is useful for your debugging since it's straightforward to override the CMD in a docker run or docker-compose run instruction, leaving the ENTRYPOINT in place.
docker-compose run spark \
printenv
This launches a new container based on all of the same Dockerfile setup. When it runs, it runs printenv instead of the CMD in the image. This will do the first-time setup in the ENTRYPOINT script, and then the final exec "$#" line will run printenv instead of starting the Spark application. This will show you the environment the application will have when it starts.
On Windows, I successfully run Prometheus from a docker image like this:
docker run -p 9090:9090 \
-v D:/WORK/MyProject/grafana:/etc/prometheus \
prom/prometheus
The D:/WORK/MyProject/grafana directory contains the prometheus.yml file with all the configs I need.
Now I need to enable @ operator usage, so I added promql-at-modifier and tried to run
docker run -p 9090:9090 \
-v D:/WORK/MyProject/grafana:/etc/prometheus \
prom/prometheus --enable-feature=promql-at-modifier
I got the following:
level=info ts=2021-07-30T14:56:29.139Z caller=main.go:143 msg="Experimental promql-at-modifier enabled"
level=error ts=2021-07-30T14:56:29.139Z caller=main.go:356 msg="Error loading config (--config.file=prometheus.yml)" err="open prometheus.yml: no such file or directory"
Tried to google. There are suggestions to mount the file:
docker run -p 9090:9090 \
-v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus
(from https://www.promlts.com/resources/wheres-my-prometheus-yml)
But no luck.
Tried to specify config file option but again no luck.
Could you help?
I am a fan of docker, but it does have a few points of friction, and you found one of them.
https://github.com/prometheus/prometheus/blob/main/Dockerfile#L25 is where the upstream prometheus defines ENTRYPOINT and CMD:
ENTRYPOINT [ "/bin/prometheus" ]
CMD [ "--config.file=/etc/prometheus/prometheus.yml", \
"--storage.tsdb.path=/prometheus", \
"--web.console.libraries=/usr/share/prometheus/console_libraries", \
"--web.console.templates=/usr/share/prometheus/consoles" ]
The problem is, any arguments provided to the docker run command will replace the default CMD. So in order to append arguments to the default CMD, you sadly need to copy the upstream CMD and then add your argument to the list.
Sadly, docker does not (currently!) support any way to "append" something to an upstream's CMD. How to append an argument to a container command? gives one idea for using an environment variable to do it.
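A sketch of that environment-variable idea applied here (EXTRA_FLAGS is a name I made up; the script bakes in the upstream defaults and would be installed as the ENTRYPOINT of a derived image):
#!/bin/sh
# wrapper-entrypoint.sh: run prometheus with its default flags plus $EXTRA_FLAGS.
# $EXTRA_FLAGS is deliberately left unquoted so that several space-separated
# flags word-split into separate arguments.
exec /bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus \
  --web.console.libraries=/usr/share/prometheus/console_libraries \
  --web.console.templates=/usr/share/prometheus/consoles \
  $EXTRA_FLAGS
docker run -p 9090:9090 -e EXTRA_FLAGS=--enable-feature=promql-at-modifier my-prometheus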
In the general case where I want to provide default arguments and allow the invocation to provide additional arguments, I usually follow this pattern:
Make the entrypoint launch a shell script
exec the real entrypoint at the end of the shell script. exec replaces the shell with the real entrypoint; the exec is important so that signals are passed to the entrypoint and not to the wrapper shell script.
At the end of the arguments to exec within the script, add "$@", which expands to the arguments of the shell script, each quoted appropriately (yes, shell is quite esoteric! you'd think it would quote all the arguments together, but instead it quotes each argument individually, because that token is magical)
In this way, the "default" commands live in the shell script and thus don't need to be included in CMD. The downside of this method is that the arguments baked into the shell script are harder to remove if you ever want to.
Here's an example:
https://github.com/farrellit/stackoverflow/tree/main/68593213
The dockerfile includes a default CMD:
FROM alpine
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["7"]
the entrypoint.sh includes a set of "automatic" arguments, to which the CMD (either the default or an override) is appended:
#!/bin/sh
exec echo 1 2 3 "$@"
The Makefile demonstrates two invocations:
docker run --rm stackoverflow-68593213
docker run --rm stackoverflow-68593213 4 5 6
docker run --rm stackoverflow-68593213
1 2 3 7
docker run --rm stackoverflow-68593213 4 5 6
1 2 3 4 5 6
Here, 1 2 3 are the default "base" parameters I always want to pass to the ENTRYPOINT, 7 is the default "additional" parameter, and 4 5 6 are provided to override the default parameters.
Can you try adding:
--config.file=/etc/prometheus/prometheus.yml
i.e.
docker run --publish=9090:9090 \
--volume=D:/WORK/MyProject/grafana:/etc/prometheus \
prom/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--enable-feature=promql-at-modifier
Explanation: Once you add flags (e.g. --enable-feature), other flags take default values. The default value for --config.file is prometheus.yml which is not what you want (you want /etc/prometheus/prometheus.yml) and so you must explicitly reference it.
Just a few brief details that lie behind DazWilkin's answer:
If you docker inspect the prom/prometheus image, you'll find the
following:
"Entrypoint": [
"/bin/prometheus"
],
"Cmd": [
"--config.file=/etc/prometheus/prometheus.yml",
"--storage.tsdb.path=/prometheus",
"--web.console.libraries=/usr/share/prometheus/console_libraries",
"--web.console.templates=/usr/share/prometheus/consoles"
],
When you run:
docker run ... prom/prometheus --enable-feature=promql-at-modifier
You are replacing the existing Cmd setting, so the command actually
executed is /bin/prometheus --enable-feature=promql-at-modifier. To
provide the same behavior as you get by default, you would actually
want to run:
docker run ... prom/prometheus \
--enable-feature=promql-at-modifier \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/prometheus \
--web.console.libraries=/usr/share/prometheus/console_libraries \
--web.console.templates=/usr/share/prometheus/consoles
I'm trying to run a docker container from snakemake. The jobs run and produce the correct output but when they complete I get
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
and snakemake tries to remove the files.
Things I've tried to debug this:
Adding || true, set +u or exit 0 to the shell command in the rule
Running the commands shown in --printshellcmds (they run fine)
Putting the docker command in a bash script in 'strict' mode to see
if there are any issues (there are not)
Printing the actual exit
code in the shell: command of the rule and it prints 0
Running the R script which is being called in the container directly
from the snakemake (it works fine)
Adding --user $(id -u):$(id -g) to the docker run command suggested here: Snakemake claims rule exits with non-zero exit code, even with "|| true"?. This
fails as the R script in the container has nowhere to write out
intermediate files as my current user does not properly exist in the container
Here is the snakefile rule:
rule run_biospyder:
input:
counts = "{dir}/{name}_counts.csv",
metadata = "{dir}/{name}_metadata.csv",
output: directory("{dir}/{name}_output")
params:
dockervol = "/usr/src/data"
shell:
"""
docker run \
-v $PWD:{params.dockervol} biospyderpipeline \
--config-file {params.dockervol}/config.yml \
--counts {params.dockervol}/{input.counts} \
--samples {params.dockervol}/{input.metadata} \
--output {params.dockervol}/{output} \
--name {wildcards.name}
"""
The issue is a docker permissions issue when mapping volumes. The problem seems to be that the container runs as root (I'm just developing), while snakemake tries to write out .snakemake_timestamp as my current user to the output directory on the mapped volume (which is owned by root).
I managed to fix it with this: https://denibertovic.com/posts/handling-permissions-with-docker-volumes/
The shell script at the entrypoint creates a user with the same id and runs any commands as that user. This means the mapped volume contents are owned by my user, and Snakemake can write to them at the end of the run. It also means the user id isn't fixed at docker build time, so anyone can use the image.
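For reference, a condensed sketch of the entrypoint pattern from that post (it assumes gosu is installed in the image; LOCAL_USER_ID is supplied by the caller, e.g. docker run -e LOCAL_USER_ID=$(id -u) ...):
#!/bin/bash
# entrypoint.sh: create a user matching the caller's uid, then drop privileges.
USER_ID=${LOCAL_USER_ID:-9001}
useradd --shell /bin/bash -u "$USER_ID" -o -c "" -m user
export HOME=/home/user
exec gosu user "$@"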
I want to set $PS1 environment variable to the container. It helps me to identify multilevel or complex docker environment setup. Currently docker container prompts with:
root@container-id#
If I can change it as following , I can identify the container by looking at the $PS1 prompt itself.
[Level-1]root@container-id#
I experimented with exporting $PS1 by building my own image (Dockerfile), via the .profile file, etc., but it's not reflected.
I had the same problem but in docker-compose context.
Here is how I managed to make it work:
# docker-compose.yml
version: '3'
services:
my_service:
image: my/image
environment:
- "PS1=$$(whoami):$$(pwd) $$ "
Just pass PS1 value as an environment variable in docker-compose.yml configuration file.
Notice how dollar signs need to be escaped to prevent docker-compose from interpolating the values (documentation).
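To check that the value survives the interpolation, something like this should print the literal prompt string (service name as in the snippet above):
docker-compose run --rm my_service sh -c 'echo "$PS1"'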
This Dockerfile sets PS1 by doing:
RUN echo 'export PS1="[\u@docker] \W # "' >> /root/.bash_profile
We use a similar technique for tracking inputs and outputs in complex container builds.
https://github.com/ianmiell/shutit/blob/master/shutit_global.py#L1338
This line represents the product of hard-won experience dealing with docker/(p)expect combinations:
"SHUTIT_BACKUP_PS1_%s=$PS1 && PS1='%s' && unset PROMPT_COMMAND"
Backing up the prompt is handy if you want to revert, PS1='%s' sets the prompt itself, and unsetting PROMPT_COMMAND removes any nasty surprises with the terminal being reset etc. for the expect session.
If the question is about how to ensure it's set when you run the container up (as opposed to building it), then you may need to add something to your .bashrc / .profile files, depending on how you start the container. As far as I know there's no way to ensure it with a Dockerfile directive and make it persist.
I normally create /home/USER/.bashrc or /root/.bashrc, depending on who the USER of the Dockerfile is. That works well. I've tried
ENV PS1 '# '
but that never worked for me.
Here's a way to set the PS1 when you run the container:
docker run -it \
python:latest \
bash -c "echo \"export PS1='[python:latest] \w$ '\" >> ~/.bashrc && bash"
I made a little wrapper script, to be able to run any image with my custom prompt:
#!/usr/bin/env bash
# ~/bin/docker-run
set -eu
image=$1
docker run -it \
-v $(pwd):/opt/app \
-w /opt/app ${image} \
bash -c "echo \"export PS1='[${image}] \w$ '\" >> ~/.bashrc && bash"
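Saved as ~/bin/docker-run and made executable (chmod +x ~/bin/docker-run), usage would look like, for example:
docker-run python:3.12   # drops into a bash shell whose prompt shows [python:3.12]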
In Debian 9, for running bash, this worked:
RUN echo 'export PS1="[\$ENV_VAR] \W # "' >> /root/.bashrc
It's generally running as root, and I generally know I am in docker, so I wanted a prompt that indicated what the container was, and I used an environment variable for that. And I guess the bash I use loads .bashrc preferentially.
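For that prompt to show anything useful, ENV_VAR has to be set when the container runs, for example (the image name and value are placeholders):
docker run -it -e ENV_VAR=level-1 my-debian-image bash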
Try setting environment variables using docker options
Example:
docker run \
-ti \
--rm \
--name ansibleserver-debug \
-w /githome/axel-ansible/ \
-v /home/lordjea/githome/:/githome/ \
-e "PS1=DEBUG$(pwd)# " \
lordjea/priv:311 bash
docker --help
Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
Run a command in a new container
Options:
...
-e, --env list Set environment variables
...
You should set that in .profile, not .bashrc.
Just open .profile from your root or home and replace PS1='\u#\h:\w\$ ' with PS1='\e[33;1m\u#\h: \e[31m\W\e[0m\$ ' or whatever you want.
Note that you need to restart your container.
On my Mac I have an alias named lxsh that will start a bash shell using the ubuntu image in my current directory (details). To make the shell's prompt change, I mounted a host file onto /root/.bash_aliases. It's a dirty hack, but it works. The full alias:
alias lxsh='echo "export PS1=\"lxsh[\[$(tput bold)\]\t\[$(tput sgr0)\]\w]\\$\[$(tput sgr0)\] \"" > $TMPDIR/a5ad217e-0f2b-471e-a9f0-a49c4ae73668 && docker run --rm --name lxsh -v $TMPDIR/a5ad217e-0f2b-471e-a9f0-a49c4ae73668:/root/.bash_aliases -v $PWD:$PWD -w $PWD -it ubuntu'
The below solution assumes that you've used Dockerfile USER to set a non-root Linux user for Bash.
What you might have tried without success:
ENV PS1='[docker]$' ## may not work
Using ENV to set PS1 can fail because the value can be overridden by default settings in a preexisting .bashrc when an interactive shell is started. Some Linux distributions are opinionated about PS1 and set it in an initial .bashrc for each user (Ubuntu does this, for example).
The fix is to modify the Dockerfile to set the desired value at the end of the user's .bashrc -- overriding any earlier settings in the script.
FROM ubuntu:20.04
# ...
USER myuser ## the username
RUN echo "PS1='\n[ \u@docker \w ]\n$ '" >> .bashrc
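Building and running that image should then show the customized prompt (the tag is my own):
docker build -t ps1-demo .
docker run -it ps1-demo bash
[ myuser@docker ~ ]
$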