OpenFaaS serve model using Tensorflow serving - docker

I'd like to serve a TensorFlow model using OpenFaaS. Basically, I'd like to invoke the "serve" function in such a way that TensorFlow Serving exposes my model.
OpenFaaS is running correctly on Kubernetes and I am able to invoke functions via curl or from the UI.
I used incubator-flask as an example, but I keep receiving 502 Bad Gateway all the time.
The OpenFaaS project looks like the following
serve/
- Dockerfile
stack.yaml
The inner Dockerfile is the following
FROM tensorflow/serving
RUN mkdir -p /home/app
RUN apt-get update \
&& apt-get install curl -yy
RUN echo "Pulling watchdog binary from Github." \
&& curl -sSLf https://github.com/openfaas-incubator/of-watchdog/releases/download/0.4.6/of-watchdog > /usr/bin/fwatchdog \
&& chmod +x /usr/bin/fwatchdog
WORKDIR /root/
# remove unnecessary logs from S3
ENV TF_CPP_MIN_LOG_LEVEL=3
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
ENV AWS_REGION=${AWS_REGION}
ENV S3_ENDPOINT=${S3_ENDPOINT}
ENV fprocess="tensorflow_model_server --rest_api_port=8501 \
--model_name=${MODEL_NAME} \
--model_base_path=${MODEL_BASE_PATH}"
# Set to true to see request in function logs
ENV write_debug="true"
ENV cgi_headers="true"
ENV mode="http"
ENV upstream_url="http://127.0.0.1:8501"
# gRPC tensorflow serving
# EXPOSE 8500
# REST tensorflow serving
# EXPOSE 8501
RUN touch /tmp/.lock
HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1
CMD [ "fwatchdog" ]
The stack.yaml file looks like the following:
provider:
  name: faas
  gateway: https://gateway-url:8080
functions:
  serve:
    lang: dockerfile
    handler: ./serve
    image: repo/serve-model:latest
    imagePullPolicy: always
I build the image with faas-cli build -f stack.yaml and then I push it to my docker registry with faas-cli push -f stack.yaml.
When I execute faas-cli deploy -f stack.yaml -e AWS_ACCESS_KEY_ID=... I get 202 Accepted and the function appears correctly among my functions. Now I want to invoke TensorFlow Serving on the model I specified in my ENV.
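For reference, the full sequence of commands up to this point is roughly the following (the extra -e flags for the other S3 variables are assumed from the Dockerfile above; secrets elided):
faas-cli build -f stack.yaml
faas-cli push -f stack.yaml
faas-cli deploy -f stack.yaml \
-e AWS_ACCESS_KEY_ID=... \
-e AWS_SECRET_ACCESS_KEY=... \
-e AWS_REGION=... \
-e S3_ENDPOINT=...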
The way I try to make it work is to use curl like this:
curl -d '{"inputs": ["1.0", "2.0", "5.0"]}' -X POST https://gateway-url:8080/function/deploy-model/v1/models/mnist:predict
but I always obtain 502 Bad Gateway.
Does anybody have experience with OpenFaaS and TensorFlow Serving? Thanks in advance.
P.S.
If I run TensorFlow Serving without of-watchdog (basically without the OpenFaaS stuff), the model is served correctly.

Elaborating on the link mentioned by @viveksyngh:
tensorflow-serving-openfaas:
Example of packaging TensorFlow Serving with OpenFaaS to be deployed and managed through OpenFaaS with auto-scaling, scale-from-zero and a sane configuration for Kubernetes.
This example was adapted from: https://www.tensorflow.org/serving
Pre-reqs:
OpenFaaS
OpenFaaS CLI
Docker
Instructions:
Clone the repo
$ mkdir -p ~/dev/
$ cd ~/dev/
$ git clone https://github.com/alexellis/tensorflow-serving-openfaas
Clone the sample model and copy it to the function's build context
$ cd ~/dev/tensorflow-serving-openfaas
$ git clone https://github.com/tensorflow/serving
$ cp -r serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu ./ts-serve/saved_model_half_plus_two_cpu
Edit the Docker Hub username
You need to edit the stack.yml file and replace alexellis2 with your Docker Hub account.
Build the function image
$ faas-cli build
You should now have a Docker image in your local library which you can deploy to a cluster with faas-cli up
Test the function locally
All OpenFaaS images can be run stand-alone without OpenFaaS installed, so let's do a quick test; replace alexellis2 with your own name.
$ docker run -p 8081:8080 -ti alexellis2/ts-serve:latest
Now in another terminal:
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://127.0.0.1:8081/v1/models/half_plus_two:predict
{
    "predictions": [2.5, 3.0, 4.5]
}
From here you can run faas-cli up and then invoke your function from the OpenFaaS UI, CLI or REST API.
$ export OPENFAAS_URL=http://127.0.0.1:8080
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' $OPENFAAS_URL/function/ts-serve/v1/models/half_plus_two:predict
{
    "predictions": [2.5, 3.0, 4.5]
}

Related

Unable to set environment variable inside docker container when calling sh file from Dockerfile CMD

I am following this link to create a Spark cluster. I am able to run the Spark cluster. However, I have to give an absolute path to start spark-shell. I am trying to set environment variables, i.e. PATH and a few others, in start-spark.sh. However, they are not being set inside the container. I tried printing them using printenv inside the container, but these variables are never reflected.
Am I trying to set environment variables incorrectly? The Spark cluster is running successfully, though.
I am using docker-compose.yml to build and recreate an image and container.
docker-compose up --build
Dockerfile
# builder step used to download and configure spark environment
FROM openjdk:11.0.11-jre-slim-buster as builder
# Add Dependencies for PySpark
RUN apt-get update && apt-get install -y curl vim wget software-properties-common ssh net-tools ca-certificates python3 python3-pip python3-numpy python3-matplotlib python3-scipy python3-pandas python3-simpy
# JDBC driver download and install
ADD https://go.microsoft.com/fwlink/?linkid=2168494 /usr/share/java
RUN update-alternatives --install "/usr/bin/python" "python" "$(which python3)" 1
# Fix the value of PYTHONHASHSEED
# Note: this is needed when you use Python 3.3 or greater
ENV SPARK_VERSION=3.1.2 \
HADOOP_VERSION=3.2 \
SPARK_HOME=/opt/spark \
PYTHONHASHSEED=1
# Download and uncompress spark from the apache archive
RUN wget --no-verbose -O apache-spark.tgz "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" \
&& mkdir -p ${SPARK_HOME} \
&& tar -xf apache-spark.tgz -C ${SPARK_HOME} --strip-components=1 \
&& rm apache-spark.tgz
My Dockerfile-spark
When SPARK_BIN="${SPARK_HOME}/bin/" is set under ENV in the Dockerfile, the environment variable gets set. It is visible inside the Docker container by using printenv.
FROM apache-spark
WORKDIR ${SPARK_HOME}
ENV SPARK_MASTER_PORT=7077 \
SPARK_MASTER_WEBUI_PORT=8080 \
SPARK_LOG_DIR=${SPARK_HOME}/logs \
SPARK_MASTER_LOG=${SPARK_HOME}/logs/spark-master.out \
SPARK_WORKER_LOG=${SPARK_HOME}/logs/spark-worker.out \
SPARK_WORKER_WEBUI_PORT=8080 \
SPARK_MASTER="spark://spark-master:7077" \
SPARK_WORKLOAD="master"
COPY start-spark.sh /
CMD ["/bin/bash", "/start-spark.sh"]
start-spark.sh
#!/bin/bash
. "$SPARK_HOME/bin/load-spark-env.sh"
export SPARK_BIN="${SPARK_HOME}/bin/" # This doesn't work here
export PATH="${SPARK_HOME}/bin/:${PATH}" # This doesn't work here
# When the spark work_load is master run class org.apache.spark.deploy.master.Master
if [ "$SPARK_WORKLOAD" == "master" ];
then
export SPARK_MASTER_HOST=`hostname` # This works here
cd $SPARK_BIN && ./spark-class org.apache.spark.deploy.master.Master --ip $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT >> $SPARK_MASTER_LOG.
My file structure is:
dockerfile
dockerfile-spark # this uses pre-built image created by dockerfile
start-spark.sh # invoked by dockerfile-spark
docker-compose.yml # uses build parameter to build an image from dockerfile-spark
From inside the master container
root@3abbd4508121:/opt/spark# export
declare -x HADOOP_VERSION="3.2"
declare -x HOME="/root"
declare -x HOSTNAME="3abbd4508121"
declare -x JAVA_HOME="/usr/local/openjdk-11"
declare -x JAVA_VERSION="11.0.11+9"
declare -x LANG="C.UTF-8"
declare -x OLDPWD
declare -x PATH="/usr/local/openjdk-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
declare -x PWD="/opt/spark"
declare -x PYTHONHASHSEED="1"
declare -x SHLVL="1"
declare -x SPARK_HOME="/opt/spark"
declare -x SPARK_LOCAL_IP="spark-master"
declare -x SPARK_LOG_DIR="/opt/spark/logs"
declare -x SPARK_MASTER="spark://spark-master:7077"
declare -x SPARK_MASTER_LOG="/opt/spark/logs/spark-master.out"
declare -x SPARK_MASTER_PORT="7077"
declare -x SPARK_MASTER_WEBUI_PORT="8080"
declare -x SPARK_VERSION="3.1.2"
declare -x SPARK_WORKER_LOG="/opt/spark/logs/spark-worker.out"
declare -x SPARK_WORKER_WEBUI_PORT="8080"
declare -x SPARK_WORKLOAD="master"
declare -x TERM="xterm"
root@3abbd4508121:/opt/spark#
There are a couple of different ways to set environment variables in Docker, and a couple of different ways to run processes. A container normally runs one process, which is controlled by the image's ENTRYPOINT and CMD settings. If you docker exec a second process in the container, that does not run as a child process of the main process, and will not see environment variables that are set by that main process.
In the setup you show here, the start-spark.sh script is the main container process (it is the image's CMD). If you docker exec your-container printenv, it will see things set in the Dockerfile but not things set in this script.
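For example, a quick check along these lines (the container name is a placeholder) shows the difference:
docker exec your-container printenv SPARK_HOME   # prints /opt/spark, set by ENV in the Dockerfile
docker exec your-container printenv SPARK_BIN    # prints nothing, because it is only exported inside start-spark.sh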
Things like filesystem paths will generally be fixed every time you run the container, no matter what command you're running there, so you can specify these in the Dockerfile. (Note that a variable set in an ENV instruction can't be referenced later in the same instruction, so use two lines.)
ENV SPARK_BIN=${SPARK_HOME}/bin
ENV PATH=${SPARK_BIN}:${PATH}
You can specify both an ENTRYPOINT and a CMD in your Dockerfile; if you do, the CMD is passed as arguments to the ENTRYPOINT. This leads to a useful pattern where the CMD is a standard shell command, and the ENTRYPOINT is a wrapper that does first-time setup and then runs it. You can split your script into two:
#!/bin/sh
# spark-env.sh
. "${SPARK_BIN}/load-spark-env.sh"
exec "$@"
#!/bin/sh
# start-spark.sh
spark-class org.apache.spark.deploy.master.Master \
--ip "$SPARK_MASTER_HOST" \
--port "$SPARK_MASTER_PORT" \
--webui-port "$SPARK_MASTER_WEBUI_PORT"
Then in your Dockerfile specify both parts:
COPY spark-env.sh start-spark.sh /
# ENTRYPOINT must use JSON-array (exec) syntax
ENTRYPOINT ["/spark-env.sh"]
# CMD may be any valid command; it is passed as arguments to the ENTRYPOINT
CMD ["/start-spark.sh"]
This is useful for debugging, since it's straightforward to override the CMD in a docker run or docker-compose run instruction, leaving the ENTRYPOINT in place.
docker-compose run spark \
printenv
This launches a new container based on all of the same Dockerfile setup. When it runs, it runs printenv instead of the CMD in the image. This will do the first-time setup in the ENTRYPOINT script, and then the final exec "$@" line will run printenv instead of starting the Spark application. This will show you the environment the application will have when it starts.
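The same kind of override works with plain docker run, since the ENTRYPOINT wrapper stays in place (the image name is a placeholder):
docker run --rm your-spark-image printenv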

Auto-create Rundeck jobs on startup (Rundeck in Docker container)

I'm trying to setup Rundeck inside a Docker container. I want to use Rundeck to provision and manage my Docker fleet. I found an image which ships an ansible-plugin as well. So far running simple playbooks and auto-discovering my Pi nodes work.
Docker script:
echo "[INFO] prepare rundeck-home directory"
mkdir ../../target/work/home
mkdir ../../target/work/home/rundeck
mkdir ../../target/work/home/rundeck/data
echo -e "[INFO] copy host inventory to rundeck-home"
cp resources/inventory/hosts.ini ../../target/work/home/rundeck/data/inventory.ini
echo -e "[INFO] pull image"
docker pull batix/rundeck-ansible
echo -e "[INFO] start rundeck container"
docker run -d \
--name rundeck-raspi \
-p 4440:4440 \
-v "/home/sebastian/work/workspace/workspace-github/raspi/target/work/home/rundeck/data:/home/rundeck/data" \
batix/rundeck-ansible
Now I want to feed the container with playbooks which should become jobs to run in Rundeck. Can anyone give me a hint on how I can create Rundeck jobs (which should invoke an Ansible playbook) from the outside? Via the API?
One way I can think of is creating the jobs manually once and exporting them as XML or YAML. When the container and Rundeck are up and running, I could import the jobs automatically. Is there a certain folder in rundeck-home or somewhere else where I can put those files for automatic import? Or is there an API call or something?
Could Jenkins be more suited for this task than Rundeck?
EDIT: I just changed to a Dockerfile.
FROM batix/rundeck-ansible:latest
COPY resources/inventory/hosts.ini /home/rundeck/data/inventory.ini
COPY resources/realms.properties /home/rundeck/etc/realms.properties
COPY resources/tokens.properties /home/rundeck/etc/tokens.properties
# import jobs
ENV RD_URL="http://localhost:4440"
ENV RD_TOKEN="yJhbGciOiJIUzI1NiIs"
ENV rd_api="36"
ENV rd_project="Test-Project"
ENV rd_job_path="/home/rundeck/data/jobs"
ENV rd_job_file="Ping_Nodes.yaml"
# copy job definitions and script
COPY resources/jobs-definitions/Ping_Nodes.yaml /home/rundeck/data/jobs/Ping_Nodes.yaml
RUN curl -kSsv --header "X-Rundeck-Auth-Token:$RD_TOKEN" \
-F yamlBatch=@"$rd_job_path/$rd_job_file" "$RD_URL/api/$rd_api/project/$rd_project/jobs/import?fileformat=yaml&dupeOption=update"
Do you know how I can delay the curl at the end until after the rundeck service is up and running?
That's right: you can design a script that deploys your instance and then imports the jobs with an API call using cURL (pointed at your Docker instance). I leave a basic example below; the first one needs the job definition in XML format.
For XML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_xml_file="HelloWorld.xml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=@"$rdeck_xml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=xml&dupeOption=update"
For YAML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_yml_file="HelloWorldYML.yaml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=@"$rdeck_yml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=yaml&dupeOption=update"
Here is the API call reference.
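Regarding the question above about delaying the import until Rundeck is up: one rough approach is to poll the instance before running the import script, for example (URL, timing and script name are assumptions):
#!/bin/sh
# wait until the Rundeck web port answers, then run one of the import scripts above
until curl -sf -o /dev/null "http://localhost:4440"; do
  echo "Rundeck not up yet, retrying in 5 seconds..."
  sleep 5
done
./import-jobs.sh   # hypothetical name for the import script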

Is it possible to add an installer, run it and delete it during one build step in Docker?

I'm trying to create a Docker image from a pretty large installer binary (300+ MB). I want to add the installer to the image, install it, and delete the installer. This doesn't seem to be possible:
COPY huge-installer.bin /tmp
RUN /tmp/huge-installer.bin
RUN rm /tmp/huge-installer.bin # <- has no effect on the image size
Using multiple build stages doesn't seem to solve this, since I need to run the installer in the final image. If I could execute the installer directly from a previous build stage, without copying it, that would solve my problem, but as far as I know that's not possible.
Is there any way to avoid including the full weight of the installer in the final image?
I ended up solving this by using the built-in HTTP server in Python to make the project directory available to the image over HTTP.
Inside the Dockerfile, I can run commands like this, piping scripts directly to bash using curl:
RUN curl "http://127.0.0.1:${SERVER_PORT}/installer-${INSTALLER_VERSION}.bin" | bash
Or save binaries, run them and delete them in one step:
RUN curl -O "http://127.0.0.1:${SERVER_PORT}/binary-${INSTALLER_VERSION}.bin" && \
./binary-${INSTALLER_VERSION}.bin && \
rm binary-${INSTALLER_VERSION}.bin
I use a Makefile to start the server and stop it after the build, but you can use a build script instead.
Here's a Makefile example:
SHELL := bash
IMAGE_NAME := app-test
VERSION := 1.0.0
SERVER_PORT := 8580
.ONESHELL:
.PHONY: build
build:
	# Kills the HTTP server when the build is done
	function cleanup {
		pkill -f "python3 -m http.server.*${SERVER_PORT}"
	}
	trap cleanup EXIT
	# Starts a HTTP server that makes the contents of the project directory
	# available to the image
	python3 -m http.server -b 127.0.0.1 ${SERVER_PORT} &>/dev/null &
	sleep 1
	EXTRA_ARGS=""
	# Allows skipping the build cache by setting NO_CACHE=1
	if [[ -n $$NO_CACHE ]]; then
		EXTRA_ARGS="--no-cache"
	fi
	docker build $$EXTRA_ARGS \
		--network host \
		--build-arg SERVER_PORT=${SERVER_PORT} \
		-t ${IMAGE_NAME}:latest \
		.
	docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${VERSION}
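Usage would then look something like this (target and variable names taken from the Makefile above):
make build              # builds using the Docker build cache
NO_CACHE=1 make build   # passes --no-cache to docker build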
I think the best way is to download the bin from a website and then run it:
RUN wget -O /tmp/huge-installer.bin http://myweb/huge-installer.bin && chmod +x /tmp/huge-installer.bin && /tmp/huge-installer.bin && rm /tmp/huge-installer.bin
In this way, your image layer will not contain the binary you downloaded.
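As a rough way to confirm this, docker history lists the size of each layer, so the layer for the combined RUN instruction should stay small (the image name is a placeholder):
docker history your-image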
I didn't test it thoroughly, but wouldn't such an approach be viable? (Besides LinPy's answer, which is way easier if you have the possibility to just do it that way.)
Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /tmp/entrypoint.sh
RUN \
echo "I am an image that can run your huge installer binary!" \
&& echo "I will only function when you give it to me as a volume mount."
ENTRYPOINT [ "/tmp/entrypoint.sh" ]
entrypoint.sh:
#!/bin/sh
/tmp/your-installer # install your stuff here
while true; do
echo "installer finished, commit me now!"
sleep 5
done
Then run:
$ docker build -t foo-1 .
$ docker run --name foo-1 --rm -d -v $(pwd)/your-installer:/tmp/your-installer foo-1
$ docker logs -f foo-1
# once it echoes "commit me now!", run the next command
$ docker commit foo-1 foo-2
$ docker stop foo-1
Since the installer was only mounted as a volume, the image foo-2 should not contain it anymore. You could also go and build another Dockerfile based on foo-2 to change the entrypoint, for example.
Cf. docker commit
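As an alternative to a follow-up Dockerfile, docker commit itself can rewrite the entrypoint via --change; the path here is a made-up example:
docker commit --change 'ENTRYPOINT ["/opt/your-app/bin/your-app"]' foo-1 foo-2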

Understanding the difference in sequence of ENTRYPOINT/CMD between Dockerfile and docker run

Docker noob here...
I am trying to build and run an IBM DataPower container from a Dockerfile, but it doesn't seem to work the same as when just running docker run and passing the same parameters in the terminal.
This works (docker run)
docker run -it \
-v $PWD/config:/drouter/config \
-e DATAPOWER_ACCEPT_LICENSE=true \
-e DATAPOWER_INTERACTIVE=true \
-e DATAPOWER_WORKER_THREADS=4 \
-p 9090:9090 \
--name mydatapower \
ibmcom/datapower
... the key part being that it mounts the ./config folder, and the custom configuration is picked up by DataPower running in the container.
This doesn't (Dockerfile)
Dockerfile:
FROM ibmcom/datapower
ENV DATAPOWER_ACCEPT_LICENSE=true
ENV DATAPOWER_INTERACTIVE=true
ENV DATAPOWER_WORKER_THREADS=4
EXPOSE 9090
COPY config/auto-startup.cfg /drouter/config/auto-startup.cfg
Build:
docker build -t local/datapower .
Run:
docker run -it \
-p 9090:9090 \
--name mydatapower local/datapower
The problem is that DataPower doesn't pick up the auto-startup.cfg file, so the additional config options don't get used. I know the source file path is correct because if I misspell the file name, Docker throws an error.
I have a theory that it might be running the inherited ENTRYPOINT or CMD before the config file is available. I don't know how to test or prove this. I don't know what the ENTRYPOINT or CMD is because the inherited image is not open source and I can't figure out how to find it.
Does that seem likely?
UPDATE:
The content of the auto-startup.cfg is:
top; co
ssh
web-mgmt
admin enabled
port 9090
exit
It simply enables the DataPower WebGUI.
The output when running it on the command line with:
docker run -it -v $PWD/config:/drouter/config -v $PWD/local:/drouter/local -e DATAPOWER_ACCEPT_LICENSE=true -e DATAPOWER_INTERACTIVE=true -e DATAPOWER_WORKER_THREADS=4 -p 9091:9090 --name myconfigureddatapower ibmcom/datapower
...contains this:
20170908T121729.015Z [0x8100006e][system][notice] : Executing startup configuration.
20170908T121729.970Z [0x00350014][mgmt][notice] web-mgmt(WebGUI-Settings): tid(303): Operational state up
...but with Dockerfile it doesn't. That's why I think the config files may be copied into place too late.
I've tried adding CMD ["/bin/drouter"] to the end of my Dockerfile to no avail.
I have tested your Dockerfile and it seems to be working. My auto-startup.cfg file is copied to the proper location, and when I launch the container it reads the file.
I get this output:
[root@ip-172-30-2-164 tmp]# docker run -ti -p 9090:9090 test
20170908T123728.818Z [0x8040006b][system][notice] logging target(default-log): Logging started.
20170908T123729.067Z [0x804000fe][system][notice] : Container instance UUID: 36bcca0e-6139-4694-91b0-2b7b66c3a498, Cores: 4, vCPUs: 4, CPU model: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz, Memory: 16049.1MB, Platform: docker, OS: dpos, Edition: developers-limited, Up time: 0 minutes
20170908T123729.071Z [0x8040001c][system][notice] : DataPower IDG is on-line.
20170908T123729.071Z [0x8100006f][system][notice] : Executing default startup configuration.
20170908T123729.416Z [0x8100006d][system][notice] : Executing system configuration.
20170908T123729.417Z [0x8100006b][mgmt][notice] domain(default): tid(8143): Domain operational state is up.
708f98be1390
Unauthorized access prohibited.
20170908T123731.239Z [0x806000dd][system][notice] cert-monitor(Certificate Monitor): tid(399): Enabling Certificate Monitor to scan once every 1 days for soon to expire certificates
20170908T123731.552Z [0x8100006e][system][notice] : Executing startup configuration.
20170908T123732.436Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.
20170908T123732.449Z [0x00350014][mgmt][notice] web-mgmt(WebGUI-Settings): tid(303): Operational state up
login:
To check that your file has been copied to the container you can run docker run -ti local/datapower sh to enter the container and then check the content of /drouter/config/.
Your base image's command is CMD ["/bin/drouter"]; you can check it by running docker history ibmcom/datapower.
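For example, either of these shows the inherited ENTRYPOINT/CMD (the image only needs to be pulled locally):
docker history --no-trunc ibmcom/datapower | head
docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' ibmcom/datapower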
UPDATE:
The drouter user in the container must be able to read the auto-startup.cfg file. You have two options:
set your local auto-startup.cfg with the proper permissions (chmod 644 config/auto-startup.cfg),
or add these lines to the Dockerfile so drouter can read the file:
USER root
RUN chown drouter /drouter/config/auto-startup.cfg
USER drouter

Simple Shell Script: Dead Docker Containers

I have a very simple Docker container that runs a bash shell script that returns something. My Dockerfile:
# Docker image to get stats from a rest interface using CURL and JSON parsing
FROM ubuntu
RUN apt-get update
# Install curl and jq, a lightweight command-line JSON processor
RUN apt-get install -y curl jq
COPY ./stats.sh /
# Make sure script has execute permissions for root
RUN chmod 500 stats.sh
# Define a custom entrypoint to execute stats commands easily within the container,
# using environment substitution and the like...
ENTRYPOINT ["/stats.sh"]
CMD ["info"]
The stats.sh looks like this:
#!/bin/bash
# ElasticSearch
## Get the total size of the elasticsearch DB in bytes
## Requires the elasticsearch container to be linked with alias 'elasticsearch'
function es_size() {
local size=$(curl $ELASTICSEARCH_PORT_9200_TCP_ADDR:$ELASTICSEARCH_PORT_9200_TCP_PORT/_stats/_all 2>/dev/null|jq ._all.total.store.size_in_bytes)
echo $size
}
if [[ "$1" == "info" ]]; then
echo "Check stats.sh for available commands"
elif [[ "$1" == "es_size" ]]; then
es_size
else
echo "Unknown command: $#"
fi
So basically, I have a Docker container that I will run with --rm to exit immediately after running and returning the value I want. More precisely, I run it from another shell script (on the host) with:
local size=$(docker run --name stats-es-size --rm --link $esName:elasticsearch $ENV_DOCKER_REST_STATS_IMAGE:$ENV_DOCKER_REST_STATS_VERSION es_size)
Now I'm running this periodically to gather statistics, once a minute. While it works well in general, I end up getting containers with status Dead about once a day.
Can anybody tell me what I might be doing wrong? Is there some problem with my approach or why do my containers die with a certain frequency?

Resources