How to Import Streamsets pipeline in Dockerfile without container exiting - docker

I am trying to import a pipeline into streamsets, during container start up, by using the Docker CMD command in Dockerfile. The image builds, but while creating the container there is no error but it exits with code 0. So it never comes up. Here is what I did:
Dockerfile:
FROM streamsets/datacollector:3.18.1
COPY myPipeline.json /pipelinejsonlocation/
EXPOSE 18630
ENTRYPOINT ["/bin/sh"]
CMD ["/opt/streamsets-datacollector-3.18.1/bin/streamsets","cli","-U", "http://localhost:18630", \
"-u", \
"admin", \
"-p", \
"admin", \
"store", \
"import", \
"-n", \
"myPipeline", \
"--stack", \
"-f", \
"/pipelinejsonlocation/myPipeline.json"]
Build image:
docker build -t cmp/sdc .
Run image:
docker run -p 18630:18630 -d --name sdc cmp/sdc
This outputs the container id. But the container is in the Exited status as shown below.
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
537adb1b05ab cmp/sdc "/bin/sh /opt/stream…" 5 seconds ago Exited (0) 3 seconds ago sdc
When I do not specify the CMD command in the Dockerfile, the streamsets container spins up and then when I run the streamsets import command in the running container in shell, it works. But how do I get it done during provisioning itself? Is there something I am missing in the Dockerfile?

In your Dockerfile you overwrite the default CMD and ENTRYPOINT from the StreamSets Data Collector Dockerfile. So the container only executes your command during startup and exits without errors afterwards. This is the reason why your container is in Exited (0) status.
In general this is good and expected behavior. If you want to keep your container alive you need to execute another command in the foreground, which never ends. But unfortunately, you cannot run multiple CMDs in your docker file.
I dug a little deeper. The default entry point of the image is ENTRYPOINT ["/docker-entrypoint.sh"]. This script sets up a few things and starts the Data Collector.
It is required that the Data Collector is running before the pipeline is imported. So a solution could be to copy the default docker-entrypoint.sh and modify it to start the Data Collector and import the pipeline afterwards. You could to it like this:
Dockerfile:
FROM streamsets/datacollector:3.18.1
COPY myPipeline.json /pipelinejsonlocation/
# Replace docker-entrypoint.sh
COPY docker-entrypoint.sh /docker-entrypoint.sh
EXPOSE 18630
docker-entrypoint.sh (https://github.com/streamsets/datacollector-docker/blob/master/docker-entrypoint.sh):
#!/bin/bash
#
# Copyright 2017 StreamSets Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
set -e
# We translate environment variables to sdc.properties and rewrite them.
set_conf() {
if [ $# -ne 2 ]; then
echo "set_conf requires two arguments: <key> <value>"
exit 1
fi
if [ -z "$SDC_CONF" ]; then
echo "SDC_CONF is not set."
exit 1
fi
grep -q "^$1" ${SDC_CONF}/sdc.properties && sed 's|^#\?\('"$1"'=\).*|\1'"$2"'|' -i ${SDC_CONF}/sdc.properties || echo -e "\n$1=$2" >> ${SDC_CONF}/sdc.properties
}
# support arbitrary user IDs
# ref: https://docs.openshift.com/container-platform/3.3/creating_images/guidelines.html#openshift-container-platform-specific-guidelines
if ! whoami &> /dev/null; then
if [ -w /etc/passwd ]; then
echo "${SDC_USER:-sdc}:x:$(id -u):0:${SDC_USER:-sdc} user:${HOME}:/sbin/nologin" >> /etc/passwd
fi
fi
# In some environments such as Marathon $HOST and $PORT0 can be used to
# determine the correct external URL to reach SDC.
if [ ! -z "$HOST" ] && [ ! -z "$PORT0" ] && [ -z "$SDC_CONF_SDC_BASE_HTTP_URL" ]; then
export SDC_CONF_SDC_BASE_HTTP_URL="http://${HOST}:${PORT0}"
fi
for e in $(env); do
key=${e%=*}
value=${e#*=}
if [[ $key == SDC_CONF_* ]]; then
lowercase=$(echo $key | tr '[:upper:]' '[:lower:]')
key=$(echo ${lowercase#*sdc_conf_} | sed 's|_|.|g')
set_conf $key $value
fi
done
# MODIFICATIONS:
#exec "${SDC_DIST}/bin/streamsets" "$#"
check_data_collector_status () {
watch -n 1 ${SDC_DIST}/bin/streamsets cli -U http://localhost:18630 ping | grep -q 'version' && echo "Data Collector has started!" && import_pipeline
}
function import_pipeline () {
sleep 1
echo "Start to import pipeline"
${SDC_DIST}/bin/streamsets cli -U http://localhost:18630 -u admin -p admin store import -n myPipeline --stack -f /pipelinejsonlocation/myPipeline.json
echo "Finished importing pipeline"
}
# Start checking if Data Collector is up (in background) and start Data Collector
check_data_collector_status & ${SDC_DIST}/bin/streamsets $#
I commented out the last line exec "${SDC_DIST}/bin/streamsets" "$#" of the default docker-entrypoint.sh and added two functions. check_data_collector_status () pings the Data Collector service until it is available. import_pipeline () imports your pipeline.
check_data_collector_status () runs in background and ${SDC_DIST}/bin/streamsets $# is started in foreground as before. So the pipeline is imported after the Data Collector service is started.

Run this image with sleep command:
docker run -p 18630:18630 -d --name sdc cmp/sdc sleep 300
300 is the time to sleep in seconds.
Then exec your script manually within the docker container and find out what's wrong.

Related

How can we run flyway/migrations script inside Cassandra Dockerfile?

My docker file
FROM cassandra:4.0
MAINTAINER me
EXPOSE 9042
I want to run something like when cassandra image is fetched and super user is made inside container.
create keyspace IF NOT EXISTS XYZ WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
I have also tried out adding a shell script but it never connects to cassandra, my modified docker file is
FROM cassandra:4.0
MAINTAINER me
ADD entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod 755 /usr/local/bin/entrypoint.sh
RUN mkdir scripts
COPY alter.cql scripts/
RUN chmod 755 scripts/alter.cql
EXPOSE 9042
CMD ["entrypoint.sh"]
My entrypoint looks like this
#!/bin/bash
export CQLVERSION=${CQLVERSION:-"4.0"}
export CQLSH_HOST=${CQLSH_HOST:-"localhost"}
export CQLSH_PORT=${CQLSH_PORT:-"9042"}
cqlsh=( cqlsh --cqlversion ${CQLVERSION} )
# test connection to cassandra
echo "Checking connection to cassandra..."
for i in {1..30}; do
if "${cqlsh[#]}" -e "show host;" 2> /dev/null; then
break
fi
echo "Can't establish connection, will retry again in $i seconds"
sleep $i
done
if [ "$i" = 30 ]; then
echo >&2 "Failed to connect to cassandra at ${CQLSH_HOST}:${CQLSH_PORT}"
exit 1
fi
# iterate over the cql files in /scripts folder and execute each one
for file in /scripts/*.cql; do
[ -e "$file" ] || continue
echo "Executing $file..."
"${cqlsh[#]}" -f "$file"
done
echo "Done."
exit 0
This never connects to my cassandra
Any ideas please help.
Thanks .
Your problem is that you don't call the original entrypoint to start Cassandra - you overwrote it with your own code, but it just running the cqlsh, without starting Cassandra.
You need to modify your code to start Cassandra using the original entrypoint script (source) that is installed as /usr/local/bin/docker-entrypoint.sh, and then execute your script, and then wait for termination signal (you can't just exit from your script, because it will terminate the image.

crond isn't running unless I ssh to the container [duplicate]

I am trying to run a cronjob inside a docker container that invokes a shell script.
Yesterday I have been searching all over the web and stack overflow, but I could not really find a solution that works.
How can I do this?
You can copy your crontab into an image, in order for the container launched from said image to run the job.
Important: as noted in docker-cron issue 3: use LF, not CRLF for your cron file.
See "Run a cron job with Docker" from Julien Boulay in his Ekito/docker-cron:
Let’s create a new file called "hello-cron" to describe our job.
# must be ended with a new line "LF" (Unix) and not "CRLF" (Windows)
* * * * * echo "Hello world" >> /var/log/cron.log 2>&1
# An empty line is required at the end of this file for a valid cron file.
If you are wondering what is 2>&1, Ayman Hourieh explains.
The following Dockerfile describes all the steps to build your image
FROM ubuntu:latest
MAINTAINER docker#ekito.fr
RUN apt-get update && apt-get -y install cron
# Copy hello-cron file to the cron.d directory
COPY hello-cron /etc/cron.d/hello-cron
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/hello-cron
# Apply cron job
RUN crontab /etc/cron.d/hello-cron
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Run the command on container startup
CMD cron && tail -f /var/log/cron.log
But: if cron dies, the container keeps running.
(see Gaafar's comment and How do I make apt-get install less noisy?:
apt-get -y install -qq --force-yes cron can work too)
As noted by Nathan Lloyd in the comments:
Quick note about a gotcha:
If you're adding a script file and telling cron to run it, remember to
RUN chmod 0744 /the_script
Cron fails silently if you forget.
OR, make sure your job itself redirect directly to stdout/stderr instead of a log file, as described in hugoShaka's answer:
* * * * * root echo hello > /proc/1/fd/1 2>/proc/1/fd/2
Replace the last Dockerfile line with
CMD ["cron", "-f"]
But: it doesn't work if you want to run tasks as a non-root.
See also (about cron -f, which is to say cron "foreground") "docker ubuntu cron -f is not working"
Build and run it:
sudo docker build --rm -t ekito/cron-example .
sudo docker run -t -i ekito/cron-example
Be patient, wait for 2 minutes and your command-line should display:
Hello world
Hello world
Eric adds in the comments:
Do note that tail may not display the correct file if it is created during image build.
If that is the case, you need to create or touch the file during container runtime in order for tail to pick up the correct file.
See "Output of tail -f at the end of a docker CMD is not showing".
See more in "Running Cron in Docker" (Apr. 2021) from Jason Kulatunga, as he commented below
See Jason's image AnalogJ/docker-cron based on:
Dockerfile installing cronie/crond, depending on distribution.
an entrypoint initializing /etc/environment and then calling
cron -f -l 2
The accepted answer may be dangerous in a production environment.
In docker you should only execute one process per container because if you don't, the process that forked and went background is not monitored and may stop without you knowing it.
When you use CMD cron && tail -f /var/log/cron.log the cron process basically fork in order to execute cron in background, the main process exits and let you execute tailf in foreground. The background cron process could stop or fail you won't notice, your container will still run silently and your orchestration tool will not restart it.
You can avoid such a thing by redirecting directly the cron's commands output into your docker stdout and stderr which are located respectively in /proc/1/fd/1 and /proc/1/fd/2.
Using basic shell redirects you may want to do something like this :
* * * * * root echo hello > /proc/1/fd/1 2>/proc/1/fd/2
And your CMD will be : CMD ["cron", "-f"]
But: this doesn't work if you want to run tasks as a non-root.
For those who wants to use a simple and lightweight image:
FROM alpine:3.6
# copy crontabs for root user
COPY config/cronjobs /etc/crontabs/root
# start crond with log level 8 in foreground, output to stderr
CMD ["crond", "-f", "-d", "8"]
Where cronjobs is the file that contains your cronjobs, in this form:
* * * * * echo "hello stackoverflow" >> /test_file 2>&1
# remember to end this file with an empty new line
But apparently you won't see hello stackoverflow in docker logs.
What #VonC has suggested is nice but I prefer doing all cron job configuration in one line. This would avoid cross platform issues like cronjob location and you don't need a separate cron file.
FROM ubuntu:latest
# Install cron
RUN apt-get -y install cron
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Setup cron job
RUN (crontab -l ; echo "* * * * * echo "Hello world" >> /var/log/cron.log") | crontab
# Run the command on container startup
CMD cron && tail -f /var/log/cron.log
After running your docker container, you can make sure if cron service is working by:
# To check if the job is scheduled
docker exec -ti <your-container-id> bash -c "crontab -l"
# To check if the cron service is running
docker exec -ti <your-container-id> bash -c "pgrep cron"
If you prefer to have ENTRYPOINT instead of CMD, then you can substitute the CMD above with
ENTRYPOINT cron start && tail -f /var/log/cron.log
But: if cron dies, the container keeps running.
There is another way to do it, is to use Tasker, a task runner that has cron (a scheduler) support.
Why ? Sometimes to run a cron job, you have to mix, your base image (python, java, nodejs, ruby) with the crond. That means another image to maintain. Tasker avoid that by decoupling the crond and you container. You can just focus on the image that you want to execute your commands, and configure Tasker to use it.
Here an docker-compose.yml file, that will run some tasks for you
version: "2"
services:
tasker:
image: strm/tasker
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
environment:
configuration: |
logging:
level:
ROOT: WARN
org.springframework.web: WARN
sh.strm: DEBUG
schedule:
- every: minute
task: hello
- every: minute
task: helloFromPython
- every: minute
task: helloFromNode
tasks:
docker:
- name: hello
image: debian:jessie
script:
- echo Hello world from Tasker
- name: helloFromPython
image: python:3-slim
script:
- python -c 'print("Hello world from python")'
- name: helloFromNode
image: node:8
script:
- node -e 'console.log("Hello from node")'
There are 3 tasks there, all of them will run every minute (every: minute), and each of them will execute the script code, inside the image defined in image section.
Just run docker-compose up, and see it working. Here is the Tasker repo with the full documentation:
http://github.com/opsxcq/tasker
Though this aims to run jobs beside a running process in a container via Docker's exec interface, this may be of interest for you.
I've written a daemon that observes containers and schedules jobs, defined in their metadata, on them. Example:
version: '2'
services:
wordpress:
image: wordpress
mysql:
image: mariadb
volumes:
- ./database_dumps:/dumps
labels:
deck-chores.dump.command: sh -c "mysqldump --all-databases > /dumps/dump-$$(date -Idate)"
deck-chores.dump.interval: daily
'Classic', cron-like configuration is also possible.
Here are the docs, here's the image repository.
VonC's answer is pretty thorough. In addition I'd like to add one thing that helped me. If you just want to run a cron job without tailing a file, you'd be tempted to just remove the && tail -f /var/log/cron.log from the cron command.
However this will cause the Docker container to exit shortly after running because when the cron command completes, Docker thinks the last command has exited and hence kills the container. This can be avoided by running cron in the foreground via cron -f.
If you're using docker for windows, remember that you have to change your line-ending format from CRLF to LF (i.e. from dos to unix) if you intend on importing your crontab file from windows to your ubuntu container. If not, your cron-job won't work. Here's a working example:
FROM ubuntu:latest
RUN apt-get update && apt-get -y install cron
RUN apt-get update && apt-get install -y dos2unix
# Add crontab file (from your windows host) to the cron directory
ADD cron/hello-cron /etc/cron.d/hello-cron
# Change line ending format to LF
RUN dos2unix /etc/cron.d/hello-cron
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/hello-cron
# Apply cron job
RUN crontab /etc/cron.d/hello-cron
# Create the log file to be able to run tail
RUN touch /var/log/hello-cron.log
# Run the command on container startup
CMD cron && tail -f /var/log/hello-cron.log
This actually took me hours to figure out, as debugging cron jobs in docker containers is a tedious task. Hope it helps anyone else out there that can't get their code to work!
But: if cron dies, the container keeps running.
I created a Docker image based on the other answers, which can be used like
docker run -v "/path/to/cron:/etc/cron.d/crontab" gaafar/cron
where /path/to/cron: absolute path to crontab file, or you can use it as a base in a Dockerfile:
FROM gaafar/cron
# COPY crontab file in the cron directory
COPY crontab /etc/cron.d/crontab
# Add your commands here
For reference, the image is here.
Unfortunately, none of the above answers worked for me, although all answers lead to the solution and eventually to my solution, here is the snippet if it helps someone. Thanks
This can be solved with the bash file, due to the layered architecture of the Docker, cron service doesn't get initiated with RUN/CMD/ENTRYPOINT commands.
Simply add a bash file which will initiate the cron and other services (if required)
DockerFile
FROM gradle:6.5.1-jdk11 AS build
# apt
RUN apt-get update
RUN apt-get -y install cron
# Setup cron to run every minute to print (you can add/update your cron here)
RUN touch /var/log/cron-1.log
RUN (crontab -l ; echo "* * * * * echo testing cron.... >> /var/log/cron-1.log 2>&1") | crontab
# entrypoint.sh
RUN chmod +x entrypoint.sh
CMD ["bash","entrypoint.sh"]
entrypoint.sh
#!/bin/sh
service cron start & tail -f /var/log/cron-2.log
If any other service is also required to run along with cron then add that service with & in the same command, for example: /opt/wildfly/bin/standalone.sh & service cron start & tail -f /var/log/cron-2.log
Once you will get into the docker container there you can see that testing cron.... will be getting printed every minute in file: /var/log/cron-1.log
But, if cron dies, the container keeps running.
Define the cronjob in a dedicated container which runs the command via docker exec to your service.
This is higher cohesion and the running script will have access to the environment variables you have defined for your service.
#docker-compose.yml
version: "3.3"
services:
myservice:
environment:
MSG: i'm being cronjobbed, every minute!
image: alpine
container_name: myservice
command: tail -f /dev/null
cronjobber:
image: docker:edge
volumes:
- /var/run/docker.sock:/var/run/docker.sock
container_name: cronjobber
command: >
sh -c "
echo '* * * * * docker exec myservice printenv | grep MSG' > /etc/crontabs/root
&& crond -f"
I decided to use busybox, as it is one of the smallest images.
crond is executed in foreground (-f), logging is send to stderr (-d), I didn't choose to change the loglevel.
crontab file is copied to the default path: /var/spool/cron/crontabs
FROM busybox:1.33.1
# Usage: crond [-fbS] [-l N] [-d N] [-L LOGFILE] [-c DIR]
#
# -f Foreground
# -b Background (default)
# -S Log to syslog (default)
# -l N Set log level. Most verbose 0, default 8
# -d N Set log level, log to stderr
# -L FILE Log to FILE
# -c DIR Cron dir. Default:/var/spool/cron/crontabs
COPY crontab /var/spool/cron/crontabs/root
CMD [ "crond", "-f", "-d" ]
But output of the tasks apparently can't be seen in docker logs.
When you deploy your container on another host, just note that it won't start any processes automatically. You need to make sure that 'cron' service is running inside your container.
In our case, I am using Supervisord with other services to start cron service.
[program:misc]
command=/etc/init.d/cron restart
user=root
autostart=true
autorestart=true
stderr_logfile=/var/log/misc-cron.err.log
stdout_logfile=/var/log/misc-cron.out.log
priority=998
From above examples I created this combination:
Alpine Image & Edit Using Crontab in Nano (I hate vi)
FROM alpine
RUN apk update
RUN apk add curl nano
ENV EDITOR=/usr/bin/nano
# start crond with log level 8 in foreground, output to stderr
CMD ["crond", "-f", "-d", "8"]
# Shell Access
# docker exec -it <CONTAINERID> /bin/sh
# Example Cron Entry
# crontab -e
# * * * * * echo hello > /proc/1/fd/1 2>/proc/1/fd/2
# DATE/TIME WILL BE IN UTC
Setup a cron in parallel to a one-time job
Create a script file, say run.sh, with the job that is supposed to run periodically.
#!/bin/bash
timestamp=`date +%Y/%m/%d-%H:%M:%S`
echo "System path is $PATH at $timestamp"
Save and exit.
Use Entrypoint instead of CMD
f you have multiple jobs to kick in during docker containerization, use the entrypoint file to run them all.
Entrypoint file is a script file that comes into action when a docker run command is issued. So, all the steps that we want to run can be put in this script file.
For instance, we have 2 jobs to run:
Run once job: echo “Docker container has been started”
Run periodic job: run.sh
Create entrypoint.sh
#!/bin/bash
# Start the run once job.
echo "Docker container has been started"
# Setup a cron schedule
echo "* * * * * /run.sh >> /var/log/cron.log 2>&1
# This extra line makes it a valid cron" > scheduler.txt
crontab scheduler.txt
cron -f
Let’s understand the crontab that has been set up in the file
* * * * *: Cron schedule; the job must run every minute. You can update the schedule based on your requirement.
/run.sh: The path to the script file which is to be run periodically
/var/log/cron.log: The filename to save the output of the scheduled cron job.
2>&1: The error logs(if any) also will be redirected to the same output file used above.
Note: Do not forget to add an extra new line, as it makes it a valid cron.
Scheduler.txt: the complete cron setup will be redirected to a file.
Using System/User specific environment variables in cron
My actual cron job was expecting most of the arguments as the environment variables passed to the docker run command. But, with bash, I was not able to use any of the environment variables that belongs to the system or the docker container.
Then, this came up as a walkaround to this problem:
Add the following line in the entrypoint.sh
declare -p | grep -Ev 'BASHOPTS|BASH_VERSINFO|EUID|PPID|SHELLOPTS|UID' > /container.env
Update the cron setup and specify-
SHELL=/bin/bash
BASH_ENV=/container.env
At last, your entrypoint.sh should look like
#!/bin/bash
# Start the run once job.
echo "Docker container has been started"
declare -p | grep -Ev 'BASHOPTS|BASH_VERSINFO|EUID|PPID|SHELLOPTS|UID' > /container.env
# Setup a cron schedule
echo "SHELL=/bin/bash
BASH_ENV=/container.env
* * * * * /run.sh >> /var/log/cron.log 2>&1
# This extra line makes it a valid cron" > scheduler.txt
crontab scheduler.txt
cron -f
Last but not the least: Create a Dockerfile
FROM ubuntu:16.04
MAINTAINER Himanshu Gupta
# Install cron
RUN apt-get update && apt-get install -y cron
# Add files
ADD run.sh /run.sh
ADD entrypoint.sh /entrypoint.sh
RUN chmod +x /run.sh /entrypoint.sh
ENTRYPOINT /entrypoint.sh
That’s it. Build and run the Docker image!
Here's my docker-compose based solution:
cron:
image: alpine:3.10
command: crond -f -d 8
depends_on:
- servicename
volumes:
- './conf/cron:/etc/crontabs/root:z'
restart: unless-stopped
the lines with cron entries are on the ./conf/cron file.
Note: this won't run commands that aren't in the alpine image.
Also, output of the tasks apparently won't appear in docker logs.
This question have a lot of answers, but some are complicated and another has some drawbacks. I try to explain the problems and try to deliver a solution.
cron-entrypoint.sh:
#!/bin/bash
# copy machine environment variables to cron environment
printenv | cat - /etc/crontab > temp && mv temp /etc/crontab
## validate cron file
crontab /etc/crontab
# cron service with SIGTERM support
service cron start
trap "service cron stop; exit" SIGINT SIGTERM
# just dump your logs to std output
tail -f \
/app/storage/logs/laravel.log \
/var/log/cron.log \
& wait $!
Problems solved
environment variables are not available on cron environment (like env vars or kubernetes secrets)
stop when crontab file is not valid
stop gracefully cron jobs when machine receive an SIGTERM signal
For context, I use previous script on Kubernetes with Laravel app.
this line was the one that helped me run my pre-scheduled task.
ADD mycron/root /etc/cron.d/root
RUN chmod 0644 /etc/cron.d/root
RUN crontab /etc/cron.d/root
RUN touch /var/log/cron.log
CMD ( cron -f -l 8 & ) && apache2-foreground # <-- run cron
--> My project run inside: FROM php:7.2-apache
But: if cron dies, the container keeps running.
When running on some trimmed down images that restrict root access, I had to add my user to the sudoers and run as sudo cron
FROM node:8.6.0
RUN apt-get update && apt-get install -y cron sudo
COPY crontab /etc/cron.d/my-cron
RUN chmod 0644 /etc/cron.d/my-cron
RUN touch /var/log/cron.log
# Allow node user to start cron daemon with sudo
RUN echo 'node ALL=NOPASSWD: /usr/sbin/cron' >>/etc/sudoers
ENTRYPOINT sudo cron && tail -f /var/log/cron.log
Maybe that helps someone
But: if cron dies, the container keeps running.
So, my problem was the same. The fix was to change the command section in the docker-compose.yml.
From
command: crontab /etc/crontab && tail -f /etc/crontab
To
command: crontab /etc/crontab
command: tail -f /etc/crontab
The problem was the '&&' between the commands. After deleting this, it was all fine.
Focusing on gracefully stopping the cronjobs when receiving SIGTERM or SIGQUIT signals (e.g. when running docker stop).
That's not too easy. By default, the cron process just got killed without paying attention to running cronjobs. I'm elaborating on pablorsk's answer:
Dockerfile:
FROM ubuntu:latest
RUN apt-get update \
&& apt-get -y install cron procps \
&& rm -rf /var/lib/apt/lists/*
# Copy cronjobs file to the cron.d directory
COPY cronjobs /etc/cron.d/cronjobs
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/cronjobs
# similarly prepare the default cronjob scripts
COPY run_cronjob.sh /root/run_cronjob.sh
RUN chmod +x /root/run_cronjob.sh
COPY run_cronjob_without_log.sh /root/run_cronjob_without_log.sh
RUN chmod +x /root/run_cronjob_without_log.sh
# Apply cron job
RUN crontab /etc/cron.d/cronjobs
# to gain access to environment variables, we need this additional entrypoint script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
# optionally, change received signal from SIGTERM TO SIGQUIT
#STOPSIGNAL SIGQUIT
# Run the command on container startup
ENTRYPOINT ["/entrypoint.sh"]
entrypoint.sh:
#!/bin/bash
# make global environment variables available within crond, too
printenv | grep -v "no_proxy" >> /etc/environment
# SIGQUIT/SIGTERM-handler
term_handler() {
echo 'stopping cron'
service cron stop
echo 'stopped'
echo 'waiting'
x=$(($(ps u -C run_cronjob.sh | wc -l)-1))
xold=0
while [ "$x" -gt 0 ]
do
if [ "$x" != "$xold" ]; then
echo "Waiting for $x running cronjob(s):"
ps u -C run_cronjob.sh
xold=$x
sleep 1
fi
x=$(($(ps u -C run_cronjob.sh | wc -l)-1))
done
echo 'done waiting'
exit 143; # 128 + 15 -- SIGTERM
}
# cron service with SIGTERM and SIGQUIT support
service cron start
trap "term_handler" QUIT TERM
# endless loop
while true
do
tail -f /dev/null & wait ${!}
done
cronjobs
* * * * * ./run_cronjob.sh cron1
*/2 * * * * ./run_cronjob.sh cron2
*/3 * * * * ./run_cronjob.sh cron3
Assuming you wrap all your cronjobs in a run_cronjob.sh script. That way, you can execute arbitrary code for which shutdown will wait gracefully.
run_cronjobs.sh (optional helper script to keep cronjob definitions clean)
#!/bin/bash
DIR_INCL="${BASH_SOURCE%/*}"
if [[ ! -d "$DIR_INCL" ]]; then DIR_INCL="$PWD"; fi
cd "$DIR_INCL"
# redirect all cronjob output to docker
./run_cronjob_without_log.sh "$#" > /proc/1/fd/1 2>/proc/1/fd/2
run_cronjob_without_log.sh
your_actual_cronjob_src()
Btw, when receiving a SIGKILL the container still shut downs immediately. That way you can use a command like docker-compose stop -t 60 cron-container to wait 60s for cronjobs to finish gracefully, but still terminate them for sure after the timeout.
All the answers require root access inside the container because 'cron' itself requests for UID 0.
To request root acces (e.g. via sudo) is against docker best practices.
I used https://github.com/gjcarneiro/yacron to manage scheduled tasks.
I occasionally tried to find a docker-friendly cron implementation. And this last time I tried, I've found a couple.
By docker-friendly I mean, "output of the tasks can be seen in docker logs w/o resorting to tricks."
The most promising I see at the moment is supercronic. It can be fed a crontab file, all while being docker-friendly. To make use of it:
docker-compose.yml:
services:
supercronic:
build: .
command: supercronic crontab
Dockerfile:
FROM alpine:3.17
RUN set -x \
&& apk add --no-cache supercronic shadow \
&& useradd -m app
USER app
COPY crontab .
crontab:
* * * * * date
A gist with a bit more info.
Another good one is yacron, but it uses YAML.
ofelia can be used, but they seem to focus on running tasks in separate containers. Which is probably not a downside, but I'm not sure why I'd want to do that.
And there's also a number of traditional cron implementations: dcron, fcron, cronie. But they come with "no easy way to see output of the tasks."
Just adding to the list of answers that you can also use this image:
https://hub.docker.com/repository/docker/cronit/simple-cron
And use it as a basis to start cron jobs, using it like this:
FROM cronit/simple-cron # Inherit from the base image
#Set up all your dependencies
COPY jobs.cron ./ # Copy your local config
Evidently, it is possible to run cron as a process inside the container (under root user) alongside other processes , using ENTRYPOINT statement in Dockerfile with start.sh script what includes line process cron start. More info here
#!/bin/bash
# copy environment variables for local use
env >> etc/environment
# start cron service
service cron start
# start other service
service other start
#...
If your image doesn't contain any daemon (so it's only the short-running script or process), you may also consider starting your cron from outside, by simply defining a LABEL with the cron information, plus the scheduler itself. This way, your default container state is "exited". If you have multiple scripts, this may result in a lower footprint on your system than having multiple parallel-running cron instances.
See: https://github.com/JaciBrunning/docker-cron-label
Example docker-compose.yaml:
version: '3.8'
# Example application of the cron image
services:
cron:
image: jaci/cron-label:latest
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
- "/etc/localtime:/etc/localtime:ro"
hello:
image: hello-world
restart: "no"
labels:
- "cron.schedule=* * * * * "
I wanted to share a modification to the typical off of some of these other suggestions that I found more flexible. I wanted to enable changing the cron time with an environment variable and ended up adding an additional script that runs within my entrypoint.sh, but before the call to cron -f
*updatecron.sh*
#!/bin/sh
#remove old cron files
rm -rf /etc/cron.*/*
#create a new formatted cron definition
echo "$crondef [appname] >/proc/1/fd/1 2>/proc/1/fd/2" >> /etc/cron.d/restart-cron
echo \ >> /etc/cron.d/restart-cron
chmod 0644 /etc/cron.d/restart-cron
crontab /etc/cron.d/restart-cron
This removes any existing cron files, creates a new cronfile using an ENV variable of crondef, and then loads it.
Our's was a nodejs application to be run as cron job and it was also dependent on environment variables.
The below solution worked for us.
Docker file:
# syntax=docker/dockerfile:1
FROM node:12.18.1
ENV NODE_ENV=production
COPY ["startup.sh", "./"]
# Removed steps to build the node js application
#--------------- Setup cron ------------------
# Install Cron
RUN apt-get update
RUN apt-get -y install cron
# Run every day at 1AM
#/proc/1/fd/1 2>/proc/1/fd/2 is used to redirect cron logs to standard output and standard error
RUN (crontab -l ; echo "0 1 * * * /usr/local/bin/node /app/dist/index.js > /proc/1/fd/1 2>/proc/1/fd/2") | crontab
#--------------- Start Cron ------------------
# Grant execution rights
RUN chmod 755 startup.sh
CMD ["./startup.sh"]
startup.sh:
!/bin/bash
echo "Copying env variables to /etc/environment so that it is available for cron jobs"
printenv >> /etc/environment
echo "Starting cron"
cron -f
With multiple jobs and various dependencies like zsh and curl, this is a good approach while also combining the best practices from other answers. Bonus: This does NOT require you to set +x execution permissions on myScript.sh, which can be easy to miss in a new environment.
cron.dockerfile
FROM ubuntu:latest
# Install dependencies
RUN apt-get update && apt-get -y install \
cron \
zsh \
curl;
# Setup multiple jobs with zsh and redirect outputs to docker logs
RUN (echo "\
* * * * * zsh -c 'echo "Hello World"' 1> /proc/1/fd/1 2>/proc/1/fd/2 \n\
* * * * * zsh /myScript.sh 1> /proc/1/fd/1 2>/proc/1/fd/2 \n\
") | crontab
# Run cron in forground, so docker knows the task is running
CMD ["cron", "-f"]
Integrate this with docker compose like so:
docker-compose.yml
services:
cron:
build:
context: .
dockerfile: ./cron.dockerfile
volumes:
- ./myScript.sh:/myScript.sh
Keep in mind that you need to docker compose build cron when you change contents of the cron.dockerfile, but changes to myScript.sh will be reflected right away as it's mounted in compose.

Inject SSH key into a Docker container

I am trying to find a "global" solution for injecting an SSH key into a container. I know that there are several solutions including docker build kit and so on...but I don't want to build an image and inject the SSH key. I want to inject the SSH key by using an existing image with docker compose.
I use the following docker compose file:
version: '3.1'
services:
server1:
image: XXXXXXX
container_name: server1
command: bash -c "/root/init.sh && python3 /root/my_python.py"
environment:
- MANAGED_HOST=mserver
volumes:
- ./init.sh:/root/init.sh
secrets:
- id_rsa
secrets:
id_rsa:
file: /home/user/.ssh/id_rsa
The init.sh is as follows:
#!/bin/bash
eval "$(ssh-agent -s)" > /dev/null
if [ ! -d "/root/.ssh/" ]; then
mkdir /root/.ssh
ssh-keyscan $MANAGED_HOST > /root/.ssh/known_hosts
fi
ssh-add -k /run/secrets/id_rsa
If I run docker compose with the parameter command
bash -c "/root/init.sh && python3 /root/my_python.py", then the SSH authentication to the appropriate remote host ($MANAGED_HOST) is not working.
An agent process is running:
root 8 1 0 12:50 ? 00:00:00 ssh-agent -s
known_hosts is OK:
root#c67655d87ced:~# cat /root/.ssh/known_hosts
BLABLABLA ssh-rsa AAAAB3BLABLABLA....
and the agent is running, but the private key is not added:
root#c67655d87ced:~# ssh-add -l
Could not open a connection to your authentication agent.
Now, if I log in the container (docker exec -it server1 /bin/bash) and run the commands from init.sh one by one from the command line, then the SSH authentication to the appropriate remote host ($MANAGED_HOST) is working?!?
Any idea, how I can get it working by using the docker compose?
It should be enough to cause the file $HOME/.ssh/id_rsa to exist with appropriate permissions; you don't need an ssh agent running.
#!/bin/sh
if ! [ -d "$HOME/.ssh" ]; then
mkdir "$HOME/.ssh"
fi
chmod 0700 "$HOME/.ssh"
if [ -n "$MANAGED_HOST" ]; then
ssh-keyscan "$MANAGED_HOST" >> "$HOME/.ssh/known_hosts"
fi
if [ -f /run/secrets/id_rsa ]; then
cp /run/secrets/id_rsa "$HOME/.ssh/id_rsa"
chmod 0400 "$HOME/.ssh/id_rsa"
fi
# exec "$#"
A typical pattern is to use the Dockerfile ENTRYPOINT to do first-time setup tasks like this. That will get passed the CMD as arguments, and the commented exec "$#" line at the end of the file runs that as a command. You'd set this up in your image's Dockerfile like:
FROM XXXXXX
...
# Script must be executable on the host, and must start with a
# #!/bin/sh "shebang" line
COPY init.sh /root
# MUST use JSON-array form
ENTRYPOINT ["/root/init.sh"]
# Can use any Dockerfile syntax
CMD ["python3", "/root/my_python.py"]
In your specific example, you're launching init.sh as a subprocess. The ssh-agent setup sets some environment variables, like $SSH_AUTH_SOCK, but when these run as a subprocess they don't get propagated back out to the host process. You can use the standard POSIX shell . builtin (the bash source builtin is equivalent, but non-standard) to cause those environment variables to be set in the context of the parent shell:
command: sh -c ". /root/init.sh && exec python3 /root/my_python.py"
The exec replaces the shell wrapper with the Python script, which you generally want. This will also wind up being the parent process of ssh-agent, which could potentially surprise your process if it happens to exit.

Remove /bin/busybox at build time

For reasons such as hardening a Docker image, I want local users not be able to use wget. Since wget is a function of /bin/busybox, removal seems appropriate, even if a little drastic, and would apparently work at for and at runtime.
However RUN rm /bin/busybox will cause go stack traces, when run on top of Kubernetes or locally.
Is there any build time solution?
The example would be
FROM haproxy:1.6-alpine
RUN addgroup -S haproxy && adduser -S -g haproxy haproxy
RUN rm /bin/busybox
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
With the HAProxy's default ENTRYPOINT
#!/bin/sh
set -e
# first arg is `-f` or `--some-option`
if [ "${1#-}" != "$1" ]; then
set -- haproxy "$#"
fi
if [ "$1" = 'haproxy' ]; then
# if the user wants "haproxy", let's use "haproxy-systemd-wrapper" instead so we can have proper reloadability implemented by upstream
shift # "haproxy"
set -- "$(which haproxy-systemd-wrapper)" -p /run/haproxy.pid "$#"
fi
exec "$#"
If your /bin/sh is provided by busybox, and your entrypoint uses /bin/sh, then you need to delete busybox only after the entrypoint is started.
A simpler entrypoint definition, which doesn't require a shell, might look something like:
RUN ["rm", /bin/busybox"]
ENTRYPOINT ["haproxy-systemd-wrapper", "-p", "/run/haproxy.pid"]
CMD ["-f", "/usr/local/etc/haproxy/haproxy.cfg"]

Simple Shell Script: Dead Docker Containers

I have a very simple Docker container that runs a bash shell script that returns something. My Dockerfile:
# Docker image to get stats from a rest interface using CURL and JSON parsing
FROM ubuntu
RUN apt-get update
# Install curl and jq, a lightweight command-line JSON processor
RUN apt-get install -y curl jq
COPY ./stats.sh /
# Make sure script has execute permissions for root
RUN chmod 500 stats.sh
# Define a custom entrypoint to execute stats commands easily within the container,
# using environment substitution and the like...
ENTRYPOINT ["/stats.sh"]
CMD ["info"]
The stats.sh looks like this:
#!/bin/bash
# ElasticSearch
## Get the total size of the elasticsearch DB in bytes
## Requires the elasticsearch container to be linked with alias 'elasticsearch'
function es_size() {
local size=$(curl $ELASTICSEARCH_PORT_9200_TCP_ADDR:$ELASTICSEARCH_PORT_9200_TCP_PORT/_stats/_all 2>/dev/null|jq ._all.total.store.size_in_bytes)
echo $size
}
if [[ "$1" == "info" ]]; then
echo "Check stats.sh for available commands"
elif [[ "$1" == "es_size" ]]; then
es_size
else
echo "Unknown command: $#"
fi
So basically, I have a Docker container that I will run with --rm to exit immediately after running and returning the value I want. More precise, I run it from another shell script (in the host) with:
local size=$(docker run --name stats-es-size --rm --link $esName:elasticsearch $ENV_DOCKER_REST_STATS_IMAGE:$ENV_DOCKER_REST_STATS_VERSION es_size)
Now I'm running this periodically to gather statistics, once a minute. While it works well in general, I end up getting containers with status Dead about once a day.
Can anybody tell me what I might be doing wrong? Is there some problem with my approach or why do my containers die with a certain frequency?

Resources