running logstash as a daemon inside a docker container - docker

To be fair, all I wanted to do was have metricbeat send system stats to Elasticsearch and view them in Kibana.
I read through the Elasticsearch docs, trying to find clues.
I am basing my image on Python since my actual app is written in Python, and my eventual goal is to send all logs (system stats via metricbeat, and app logs via filebeat) to Elastic.
I can't seem to find a way to run logstash as a service inside of a container.
My Dockerfile:
FROM python:2.7
WORKDIR /var/local/myapp
COPY . /var/local/myapp
# logstash
RUN wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
RUN apt-get update && apt-get install apt-transport-https dnsutils default-jre apt-utils -y
RUN echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-5.x.list
RUN apt-get update && apt-get install -y logstash
# metricbeat
#RUN wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.6.0-amd64.deb
RUN dpkg -i metricbeat-5.6.0-amd64.deb
RUN pip install --no-cache-dir -r requirements.txt
RUN apt-get autoremove -y
CMD bash strap_and_run.sh
and the extra script strap_and_run.sh:
python finalize_config.py
# start
echo "starting logstash..."
systemctl start logstash.service
#todo :get my_ip
echo "starting metric beat..."
/etc/init.d/metricbeat start
finalize_config.py
import os
import requests

LOGSTASH_PIPELINE_FILE = 'logstash_pipeline.conf'
LOGSTASH_TARGET_PATH = '/etc/logstash/conf.d'
METRICBEAT_FILE = 'metricbeat.yml'
METRICBEAT_TARGET_PATH = os.path.join(os.getcwd(), '/metricbeat-5.6.0-amd64.deb')

my_ip = requests.get("https://api.ipify.org/").content

ELASTIC_HOST = os.environ.get('ELASTIC_HOST')
ELASTIC_USER = os.environ.get('ELASTIC_USER')
ELASTIC_PASSWORD = os.environ.get('ELASTIC_PASSWORD')

if not os.path.exists(os.path.join(LOGSTASH_TARGET_PATH)):
    os.makedirs(os.path.join(LOGSTASH_TARGET_PATH))

# read logstash template file
with open(LOGSTASH_PIPELINE_FILE, 'r') as logstash_f:
    lines = logstash_f.readlines()
    new_lines = []
    for line in lines:
        new_lines.append(line
                         .replace("<elastic_host>", ELASTIC_HOST)
                         .replace("<elastic_user>", ELASTIC_USER)
                         .replace("<elastic_password>", ELASTIC_PASSWORD))

# write current file
with open(os.path.join(LOGSTASH_TARGET_PATH, LOGSTASH_PIPELINE_FILE), 'w+') as new_logstash_f:
    new_logstash_f.writelines(new_lines)

if not os.path.exists(os.path.join(METRICBEAT_TARGET_PATH)):
    os.makedirs(os.path.join(METRICBEAT_TARGET_PATH))

# read metricbeat template file
with open(METRICBEAT_FILE, 'r') as metric_f:
    lines = metric_f.readlines()
    new_lines = []
    for line in lines:
        new_lines.append(line
                         .replace("<ip-field>", my_ip)
                         .replace("<type-field>", "test"))

# write current file
with open(os.path.join(METRICBEAT_TARGET_PATH, METRICBEAT_FILE), 'w+') as new_metric_f:
    new_metric_f.writelines(new_lines)

The reason is that there is no init system inside the container, so you should not use service or systemctl. Instead, you should start the processes in the background yourself. Your updated script would look like the one below:
python finalize_config.py
# start
echo "starting logstash..."
/usr/bin/logstash &
#todo :get my_ip
echo "starting metric beat..."
/usr/bin/metricbeat &
wait
You will also need to add handling for TERM and other signals, and kill the child processes. If you don't do that, docker stop will have issues.
In such situations I prefer to use a process manager like supervisord and run supervisord as the main process (PID 1).
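For illustration, here is a minimal sketch of what such signal handling could look like in strap_and_run.sh. The binary paths are taken from the snippet above and are an assumption; adjust them to wherever your packages actually install the executables.
#!/bin/bash
python finalize_config.py

echo "starting logstash..."
/usr/bin/logstash &            # assumed path, see note above
LOGSTASH_PID=$!

echo "starting metricbeat..."
/usr/bin/metricbeat &          # assumed path, see note above
METRICBEAT_PID=$!

# Forward TERM/INT to the children so `docker stop` shuts them down cleanly
terminate() {
    kill -TERM "$LOGSTASH_PID" "$METRICBEAT_PID" 2>/dev/null
    wait "$LOGSTASH_PID" "$METRICBEAT_PID"
}
trap terminate TERM INT

# Keep this script (PID 1) alive until the children exit
wait
With supervisord instead, each process gets its own program section and supervisord takes care of signal forwarding and restarts for you.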

Related

Recommended way to execute tests within Docker container

I would like to run a set of specific tests within a Docker container and I'm not sure how to tackle this. The tests I want to perform are security-related, like creating user(s), managing GPG keys for them, and similar, which I am reluctant to run on the PC running the tests.
I tried the pytest-xdist/socketserver combo, and also copying the tests into a running Docker container and using pytest-json-report to get the result(s) as JSON saved to a volume shared with the host, but I'm not sure this approach is good.
For now, I would settle for all tests (without marks or similar features) being executed "remotely" (in Docker) and the results being presented as if everything ran on the local PC.
I don't mind writing a specific plugin, but I'm not sure if this is a good way: do I have to make sure that my plugin is loaded before, say, pytest-xdist (or some others)? Additionally, I could use, say, pytest_sessionstart in my conftest.py to build a Docker image that I can then target with xdist; but my tests also have some "dependency" that I have to put within conftest.py, so I can't use the same conftest.py within the container and on the PC running the tests.
Thank you in advance
In case anyone else has a similar need, I will share what I did.
First of all, there is already an excellent pytest-json-report plugin to export JSON results. However, I made a simpler plugin with less functionality that uses pytest_report_to_serializable directly:
import json
from socket import gethostname


def pytest_addoption(parser, pluginmanager):
    parser.addoption(
        '--report-file', default='%s.json' % gethostname(), help='path to JSON report'
    )


def pytest_configure(config):
    plugin = JsonTestsExporter(config=config)
    config._json_report = plugin
    config.pluginmanager.register(plugin)


def pytest_unconfigure(config):
    plugin = getattr(config, '_json_report', None)
    if plugin is not None:
        del config._json_report
        config.pluginmanager.unregister(plugin)
        print('Report saved in: %s' % config.getoption('--report-file'))


class JsonTestsExporter(object):
    def __init__(self, config):
        self._config = config
        self._export_data = {'collected': 0, 'results': []}

    def pytest_report_collectionfinish(self, config, start_path, startdir, items):
        self._export_data['collected'] = len(items)

    def pytest_runtest_logreport(self, report):
        data = self._config.hook.pytest_report_to_serializable(
            config=self._config, report=report
        )
        self._export_data['results'].append(data)

    def pytest_sessionfinish(self, session):
        report_file = self._config.getoption('--report-file')
        with open(report_file, 'w+') as fd:
            fd.write(json.dumps(self._export_data))
The reason behind this is that I also wanted to be able to import the results back using pytest_report_from_serializable.
Simplified Dockerfile:
FROM debian:buster-slim AS builder
COPY [ "requirements.txt", "run.py", "/artifacts/" ]
COPY [ "json_tests_exporter", "/artifacts/json_tests_exporter/" ]
RUN apt-get update\
# install necessary packages
&& apt-get install --no-install-recommends -y python3-pip python3-setuptools\
# build json_tests_exporter *.whl
&& pip3 install wheel\
&& sh -c 'cd /artifacts/json_tests_exporter && python3 setup.py bdist_wheel'
FROM debian:buster-slim
ARG USER_UID=1000
ARG USER_GID=${USER_UID}
COPY --from=builder --chown=${USER_UID}:${USER_GID} /artifacts /artifacts
RUN apt-get update\
# install necessary packages
&& apt-get install --no-install-recommends -y wget gpg openssl python3-pip\
# create user to perform tests
&& groupadd -g ${USER_GID} pytest\
&& adduser --disabled-password --gecos "" --uid ${USER_UID} --gid ${USER_GID} pytest\
# copy/install entrypoint script and preserve permissions
&& cp -p /artifacts/run.py /usr/local/bin/run.py\
# install required Python libraries
&& su pytest -c "pip3 install -r /artifacts/requirements.txt"\
&& su pytest -c "pip3 install /artifacts/json_tests_exporter/dist/*.whl"\
# make folder for tests and results
&& su pytest -c "mkdir -p /home/pytest/tests /home/pytest/results"
VOLUME [ "/home/pytest/tests", "/home/pytest/results" ]
USER pytest
WORKDIR /home/pytest/tests
ENTRYPOINT [ "/usr/local/bin/run.py" ]
The JSON exporter plugin is located in the same folder as the Dockerfile.
run.py is as simple as:
#!/usr/bin/python3
import pytest
import sys
from socket import gethostname


def main():
    if 1 == len(sys.argv):
        # use default arguments
        args = [
            '--report-file=/home/pytest/results/%s.json' % gethostname(),
            '-qvs',
            '/home/pytest/tests'
        ]
    else:
        # caller passed custom arguments
        args = sys.argv[1:]
    try:
        res = pytest.main(args)
    except Exception as e:
        print(e)
        res = 1
    return res


if __name__ == "__main__":
    sys.exit(main())
requirements.txt only contains:
python-gnupg==0.4.4
pytest>=7.1.2
So basically, I can run everything with:
docker build -t pytest-runner ./tests/docker/pytest_runner
docker run --rm -it -v $(pwd)/tests/results:/home/pytest/results -v $(pwd)/tests/fixtures:/home/pytest/tests pytest-runner
The last two lines I run programmatically from Python in the pytest_sessionstart(session) hook, using the Docker API.

Error when starting custom Airflow Docker Image GROUP_OR_COMMAND

I created a custom image with the following Dockerfile:
FROM apache/airflow:2.1.1-python3.8
USER root
RUN apt-get update \
&& apt-get -y install gcc gnupg2 \
&& curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \
&& curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update \
&& ACCEPT_EULA=Y apt-get -y install msodbcsql17 \
&& ACCEPT_EULA=Y apt-get -y install mssql-tools
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc \
&& echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc \
&& source ~/.bashrc
RUN apt-get -y install unixodbc-dev \
&& apt-get -y install python-pip \
&& pip install pyodbc
RUN echo -e “AIRFLOW_UID=$(id -u) \nAIRFLOW_GID=0” > .env
USER airflow
The image creates successfully, but when I try to run it, I get this error:
"airflow command error: the following arguments are required: GROUP_OR_COMMAND, see help above."
I have tried supplying a group ID with the --user flag, but I can't figure it out.
How can I start this custom Airflow Docker image?
Thanks!
First of all this line is wrong:
RUN echo -e “AIRFLOW_UID=$(id -u) \nAIRFLOW_GID=0” > .env
If you are running it with Docker Compose (I presume you took it from https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html), this is something you should run on the host machine, not in the image. Remove that line; it has no effect.
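For reference, the quick-start guide expects roughly the following to be run on the host, in the directory containing docker-compose.yaml, before bringing the stack up:
echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
docker-compose up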
Secondly, it really depends on what "command" you run. The "GROUP_OR_COMMAND" message you got is the output of the "airflow" command. You have not copied the whole output of your command, but this is the message you get when you try to run airflow without telling it what to do. When you run the image, it will by default run the airflow command, which has a number of subcommands that can be executed. So the "see help above" message tells you exactly what you should do: look at the help and see which subcommand you wanted to run (and then run it).
docker run -it apache/airflow:2.1.2
usage: airflow [-h] GROUP_OR_COMMAND ...
positional arguments:
GROUP_OR_COMMAND
Groups:
celery Celery components
config View configuration
connections Manage connections
dags Manage DAGs
db Database operations
jobs Manage jobs
kubernetes Tools to help run the KubernetesExecutor
pools Manage pools
providers Display providers
roles Manage roles
tasks Manage tasks
users Manage users
variables Manage variables
Commands:
cheat-sheet Display cheat sheet
info Show information about current Airflow and environment
kerberos Start a kerberos ticket renewer
plugins Dump information about loaded plugins
rotate-fernet-key
Rotate encrypted connection credentials and variables
scheduler Start a scheduler instance
sync-perm Update permissions for existing roles and optionally DAGs
version Show the version
webserver Start a Airflow webserver instance
optional arguments:
-h, --help show this help message and exit
airflow command error: the following arguments are required: GROUP_OR_COMMAND, see help above.
When you extend the official image, it will pass the parameters to the "airflow" command, which is what causes this problem. Check this out: https://airflow.apache.org/docs/docker-stack/entrypoint.html#entrypoint-commands
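As a hedged example (my-airflow-image is a hypothetical tag for the custom image built from the Dockerfile above), the argument you pass after the image name is what the entrypoint forwards to airflow as the missing GROUP_OR_COMMAND:
docker run -it my-airflow-image webserver    # runs "airflow webserver"
docker run -it my-airflow-image bash         # drops you into a shell instead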

Dockerfile Cassandra - /usr/bin/env: ‘python3\r’: No such file or directory

I can't start my Cassandra container; I get the following error while the Cassandra container is starting:
/usr/bin/env: ‘python3\r’: No such file or directory
My Dockerfile:
FROM cassandra:3.11.6
RUN apt-get update && apt-get install -y apt-transport-https && apt-get install software-properties-common -y
COPY ["schema.cql", "wait-for-it.sh", "bootstrap-schema.py", "/"]
RUN chmod +x /bootstrap-schema.py /wait-for-it.sh
ENV BOOTSTRAPPED_SCHEMA_FILE_MARKER /bootstrapped-schema
ENV BOOTSTRAP_SCHEMA_ENTRYPOINT /bootstrap-schema.py
ENV OFFICIAL_ENTRYPOINT /docker-entrypoint.sh
# 7000: intra-node communication
# 7001: TLS intra-node communication
# 7199: JMX
# 9042: CQL
# 9160: thrift service
EXPOSE 7000 7001 7199 9042 9160
#Change entrypoint to custom script
COPY cassandra.yaml /etc/cassandra/cassandra.yaml
ENTRYPOINT ["/bootstrap-schema.py"]
CMD ["cassandra", "-Dcassandra.ignore_dc=true", "-Dcassandra.ignore_rack=true", "-f"]
I get this error only when I add this line:
ENTRYPOINT ["/bootstrap-schema.py"]
I use Windows 10 (Docker for Windows installed).
What's wrong in this script: bootstrap-schema.py:
#!/usr/bin/env python3
import os
import sys
import subprocess
import signal
import logging

logger = logging.getLogger('bootstrap-schema')
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
logger.addHandler(ch)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch.setFormatter(formatter)

proc_args = [os.environ['OFFICIAL_ENTRYPOINT']]
proc_args.extend(sys.argv[1:])

if (not os.path.exists(os.environ["BOOTSTRAPPED_SCHEMA_FILE_MARKER"])):
    proc = subprocess.Popen(proc_args)  # Run official entrypoint command as child process
    wait_for_cql = os.system("/wait-for-it.sh -t 120 127.0.0.1:9042")  # Wait for CQL (port 9042) to be ready
    if (wait_for_cql != 0):
        logger.error("CQL unavailable")
        exit(1)
    logger.debug("Schema creation")
    cqlsh_ret = subprocess.run("cqlsh -f /schema.cql 127.0.0.1 9042", shell=True)
    if (cqlsh_ret.returncode == 0):
        # Terminate bg process
        os.kill(proc.pid, signal.SIGTERM)
        proc.wait(20)
        # touch file marker
        open(os.environ["BOOTSTRAPPED_SCHEMA_FILE_MARKER"], "w").close()
        logger.debug("Schema created")
    else:
        logger.error("Schema creation error. {}".format(cqlsh_ret))
        exit(1)
else:
    logger.debug("Schema already exists")

os.execv(os.environ['OFFICIAL_ENTRYPOINT'], sys.argv[1:])  # Run official entrypoint
Thanks for any tip
EDIT
Of course I tried to add, for example:
RUN apt-get install python3
OK, my fault: it was the well-known problem of Windows line endings. I had to convert the Windows (CRLF) files to Linux (LF) line endings, every single file, scripts included, everything. Now it works perfectly :)
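For anyone hitting the same '\r' error, a minimal sketch of the conversion (assuming dos2unix is available; sed works as a fallback):
# convert CRLF line endings to LF before building the image
dos2unix bootstrap-schema.py wait-for-it.sh schema.cql

# or, without dos2unix, strip the trailing carriage returns
sed -i 's/\r$//' bootstrap-schema.py wait-for-it.sh

# with Git on Windows it also helps to force LF for these files via .gitattributes:
# *.py text eol=lf
# *.sh text eol=lf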

Docker ubuntu cron tail logs not visible

I'm trying to run a Docker container that has a cron schedule. However, I cannot make it output logs.
I'm using docker-compose.
docker-compose.yml
---
version: '3'
services:
cron:
build:
context: cron/
container_name: ubuntu-cron
cron/Dockerfile
FROM ubuntu:18.10
RUN apt-get update
RUN apt-get update && apt-get install -y cron
ADD hello-cron /etc/cron.d/hello-cron
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/hello-cron
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Run the command on container startup
CMD cron && tail -F /var/log/cron.log
cron/hello-cron
* * * * * root echo "Hello world" >> /var/log/cron.log 2>&1
The above runs fine and is outputting logs inside the container, however they are not streamed to Docker.
e.g.
docker logs -f ubuntu-cron returns empty results
but
if you log in to the container (docker exec -it -i ubuntu-cron /bin/bash) you have logs:
cat /var/log/cron.log
Hello world
Hello world
Hello world
Now I'm thinking that maybe I don't need to log to a file? I could attach this to stdout, but I'm not sure how to do this.
This looks similar...
How to redirect cron job output to stdout
I tried your setup and the following Dockerfile works:
FROM ubuntu:18.10
RUN apt-get update
RUN apt-get update && apt-get install -y cron
ADD hello-cron /etc/cron.d/hello-cron
# Give execution rights on the cron job
RUN chmod 0755 /etc/cron.d/hello-cron
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Symlink the cron to stdout
RUN ln -sf /dev/stdout /var/log/cron.log
# Run the command on container startup
CMD cron && tail -F /var/log/cron.log 2>&1
Also note that I'm bringing the container up with "docker-compose up" rather than docker. It wouldn't matter in this particular example, but if your actual solution is bigger it might matter.
EDIT: Here's the output when I run docker-compose up:
neekoy@synchronoss:~$ sudo docker-compose up
Starting ubuntu-cron ... done
Attaching to ubuntu-cron
ubuntu-cron | Hello world
ubuntu-cron | Hello world
ubuntu-cron | Hello world
Same in the logs obviously:
neekoy@synchronoss:~$ sudo docker logs daf0ff73a640
Hello world
Hello world
Hello world
Hello world
Hello world
My understanding is that the above is the goal.
Due to some weirdness in the docker layers and inodes, you have to create the file during the CMD:
CMD cron && touch /var/log/cron.log && tail -F /var/log/cron.log
This works both for file and stdout:
FROM ubuntu:18.10
RUN apt-get update
RUN apt-get update && apt-get install -y cron
ADD hello-cron /etc/cron.d/hello-cron
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/hello-cron
# Create the log file to be able to run tail
# Run the command on container startup
CMD cron && touch /var/log/cron.log && tail -F /var/log/cron.log
The explanation seems to be this one:
In the original post, the tail command starts "listening" to a file which is in a layer of the image; then, when cron writes the first line to that file, Docker copies the file to a new layer, the container layer (because of the nature of the copy-on-write filesystem, the way that Docker works). So when the file gets created in the new layer it gets a different inode, and tail keeps listening to the previous one, so it loses every update to the "new" file. Credits: BMitch.
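A quick, hedged way to observe this in the running container (the exact behaviour depends on the storage driver, but with overlay2 the inode reported for the file typically changes after the first copy-up):
docker exec ubuntu-cron ls -i /var/log/cron.log   # right after start, before cron's first write
# wait a minute for the first "Hello world"...
docker exec ubuntu-cron ls -i /var/log/cron.log   # the inode number usually differs now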
Try to redirect to /dev/stdout (> /dev/stdout); after this you should see your logs with docker logs.
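A sketch of what that could look like in the hello-cron file; note that inside a cron job /dev/stdout resolves to the cron child's own stdout, so if the output still does not reach docker logs, redirecting to PID 1's stdout (/proc/1/fd/1) is a common alternative:
* * * * * root echo "Hello world" > /proc/1/fd/1 2>&1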

docker root crontab job not executing

I have an Ubuntu 14.04 Docker image in which I want to schedule a Python script to execute every minute. My Dockerfile contains CMD ["cron","-f"] in order to start the cron daemon. The crontab entry looks like this:
0,1 * * * * root python /opt/com.org.project/main.py >> /opt/com.org.project/var/log/cron.log
/opt/com.org.project/main.py is completely accessible and is owned by root with 744 permissions set, so it can be executed.
Nothing is showing up in my /opt/com.org.project/var/log/cron.log file, nor /var/log/cron.log. Yet ps aux | grep cron shows cron -f running at PID 1.
What am I missing? Why is my cron job not running within the container?
Here are my Dockerfile contents as requested:
FROM ubuntu
# Update the os and install the dependencies needed for the container
RUN apt-get update \
&& apt-get install -y \
nano \
python \
python-setuptools \
python-dev \
xvfb \
firefox
# Install PIP for python package management
RUN easy_install pip
CMD ["cron", "-f"]
Why use cron? Just write a shell script like this:
#!/bin/bash
while true; do
python /opt/com.org.project/main.py >> /opt/com.org.project/var/log/cron.log
sleep 60
done
Then just set it as entrypoint.
ENTRYPOINT ["/bin/bash", "/loop_main.sh" ]
Where did you use crontab -e? On the host running docker or in the container itself?
I can't see that you are adding a crontab entry in the Dockerfile you provided. I recommend adding an external crontab file like this:
ADD crontabfile /app/crontab
RUN crontab /app/crontab
CMD ["cron", "-f"]
The file crontabfile has to be located next to Dockerfile.
image_folder
|
|- Dockerfile
|- crontabfile
Example content of crontabfile:
# m h dom mon dow command
30 4 * * * /app/myscript.py
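Applied to the question, the crontabfile entry could look roughly like this (a sketch; with crontab there is no user field, unlike files under /etc/cron.d):
# m h dom mon dow command
* * * * * python /opt/com.org.project/main.py >> /opt/com.org.project/var/log/cron.log 2>&1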
