We have a .NET Core app which logs its output to files, e.g. portal-20200430-000.log. In the DEV environment, all is well :)
The app is deployed via docker service, which starts 3 replicas (3 Docker containers). We want all the logs from all the containers (replicas) in one place, so we mapped the file system between the host machine and the containers via a volume.
docker-compose.yml:
version: "3.7"
services:
  web:
    image: portal:0.0.1
    # --- snipped content ---
    volumes:
      - "/home/portal/Dev/Logs:/app/Logs"
    deploy:
      replicas: 3
Each container (replica) writes its own logs to /app/Logs/portal-20200430-000.log inside the container, but this folder is mapped to /home/portal/Dev/Logs on the host. So all 3 containers are writing to the same file on the host, which is not OK: some log entries get lost, the 3 containers overwrite each other's logs, and so on.
I suppose possible solutions are:
change the file name of each container's log (but logging is done via the external Karambolo logger, which has the filenames hardcoded in appsettings.json, and these settings are common to all container replicas)
instruct each docker replica to map a different volume - is that even possible?
Is there another solution?
Note - This is a partial solution.
When you start docker-compose with replicas, the only difference inside the containers is the HOSTNAME environment variable (unless it is set to a static value in docker-compose.yml).
Create a docker-compose.yml as below:
version: '3'
services:
  test_logging:
    image: bash
    entrypoint: bash
    command: -c "sleep 3600"
    deploy:
      replicas: 2
Run the containers:
docker-compose up -d
Now, if you execute bash in interactive mode in a running container, you will see the following environment variables.
$ docker exec -it test_test_logging_1 bash
bash-5.1# env
HOSTNAME=f000d941eab2
PWD=/
_BASH_BASELINE_PATCH=16
HOME=/root
_BASH_VERSION=5.1.16
_BASH_BASELINE=5.1.16
_BASH_LATEST_PATCH=16
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env
$ docker exec -it test_test_logging_2 bash
bash-5.1# env
HOSTNAME=0b848ef70202
PWD=/
_BASH_BASELINE_PATCH=16
HOME=/root
_BASH_VERSION=5.1.16
_BASH_BASELINE=5.1.16
_BASH_LATEST_PATCH=16
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
As you can see, only HOSTNAME differs between the containers. You can use the hostname to generate a different filename for each replica's log files.
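For example, a small entrypoint wrapper can inject the hostname into the configured filename before the app starts. This is only a sketch: the sed pattern, the appsettings.json path, and the dotnet command are placeholders to adapt to your actual setup.
#!/bin/sh
# entrypoint.sh (sketch): give each replica its own log file name,
# e.g. portal-20200430-000.log becomes portal-<hostname>-20200430-000.log
sed -i "s/portal-/portal-${HOSTNAME}-/" /app/appsettings.json
exec dotnet Portal.dll   # hypothetical app start command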
Disk Space Issue
However, this won't be enough. By default, a plain FileHandler will use as much disk space as it needs. It's best to move to RotatingFileHandler or TimedRotatingFileHandler. Prefer RotatingFileHandler: a sudden surge of error logs can happen for any number of reasons, and by the time TimedRotatingFileHandler rotates the files you might already have used up all the disk space.
Note that RotatingFileHandler doesn't compress rotated log files; that needs to be implemented separately.
P.S. Be aware that with R replicas, each using B backupCount files for RotatingFileHandler with S maxBytes each, you will use R x B x S bytes of disk space with no compression.
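The handler names above come from Python's logging module; as a minimal sketch of the idea, combining the per-replica hostname with size-based rotation (the path and sizes are just examples, not taken from the question):
import logging
import socket
from logging.handlers import RotatingFileHandler

# One file per replica, e.g. /app/Logs/portal-f000d941eab2.log
handler = RotatingFileHandler(
    f"/app/Logs/portal-{socket.gethostname()}.log",
    maxBytes=10 * 1024 * 1024,  # S = 10 MiB per file
    backupCount=5,              # B = 5 rotated files kept per replica
)
logging.basicConfig(level=logging.INFO, handlers=[handler])
logging.getLogger(__name__).info("replica started")
With R replicas this caps disk usage at roughly R x B x S bytes (plus the active files), uncompressed.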
Finally, this is a partial solution: if you restart the service, the hostnames change and new log files are created. Even with this issue, the approach works if you integrate a log service such as the ELK stack to collect the older log files for further analysis and error monitoring.
Related
I have a Docker container built from the debian:latest image.
I need to execute a bash script that will start several services.
My host machine is Windows 10 and I'm using Docker Desktop. I found the configuration files on the docker-desktop-data WSL2 drive in data\docker\containers\<container_name>.
There are 2 config files there:
config.v2.json and hostconfig.json
I edited the first of them and replaced:
"Entrypoint":null with "Entrypoint":["/bin/bash", "/opt/startup.sh"]
I did this while the container was stopped, but when I restarted it the script was not executed, and when I opened config.v2.json again the Entrypoint had been reset to null.
I need to run this script at every container start.
An additional strange thing is that this container doesn't show any volume in Docker Desktop. I could check out this container and start another one, but I need to preserve the current state of this container (installed packages, files, DB content). How can I change the entrypoint or run the script some other way?
Is there any way to export the container to an image along with its configuration? I need to expose several ports and run the startup script. Is there any way to make every new container created from the image exported from the current container expose the same ports and run the same startup script?
Docker's typical workflow involves containers that only run a single process, and are intrinsically temporary. You'd almost never create a container, manually set it up, and try to persist it; instead, you'd write a script called a Dockerfile that describes how to create a reusable image, and then launch some number of containers from that.
It's almost always preferable to launch multiple single-process containers than to try to run multiple processes in a single container. You can use a tool like Docker Compose to describe the multiple containers and record the various options you'd need to start them:
# docker-compose.yml
# Describe the file version. Required with the stable Python implementation
# of Compose. Most recent stable version of the file format.
version: '3.8'

# Persistent storage managed by Docker; will not be accessible on the host.
volumes:
  dbdata:

# Actual containers.
services:
  # The database.
  db:
    # Use a stock Docker Hub image.
    image: postgres:15
    # Persist its data.
    volumes:
      - dbdata:/var/lib/postgresql/data
    # Describe how to set up the initial database.
    environment:
      POSTGRES_PASSWORD: passw0rd
    # Make the container accessible from outside Docker (optional).
    ports:
      # first port: any available host port
      # second port MUST be the standard PostgreSQL port 5432
      - '5432:5432'

  # Reverse proxy / static asset server
  nginx:
    image: nginx:1.23
    # Get static assets from the host system.
    volumes:
      - ./static:/usr/share/nginx/html
    # Make the container externally accessible.
    ports:
      - '8000:80'
You can check this file into source control with your application. Also consider adding a third service with a build: block for an image containing the actual application code; that service probably will not need volumes:.
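As a sketch (the service name, port, and environment variable are placeholders for whatever your application actually needs), such a service added under services: could look like:
  app:
    build: .                # Dockerfile describing your application image
    environment:
      PGHOST: db            # reach the database via its Compose service name
    ports:
      - '3000:3000'         # hypothetical application port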
docker-compose up -d will start this stack of containers (without -d, in the foreground). If you make a change to the docker-compose.yml file, re-running the same command will delete and recreate containers as required. Note that you are never running an unmodified debian image, nor are you manually running commands inside a container; the docker-compose.yml file completely describes the containers, their startup sequences (if not already built into the images), and any required runtime options.
Also see Networking in Compose for some details about how to make connections between containers: localhost from within a container will call out to that same container and not one of the other containers or the host system.
I have a docker-compose file that exposes 2 services, a master service and a slave service. I want to be able to scale the slave service to some number of instances using
docker-compose up --scale slave=N
However, one of the options I must specify in the command run by the master service is the number of slave instances to expect. E.g. if I scale slave=10, I need to set --num-slaves=10 in the command on the master service.
Is there a way to determine the number of instances of a given service either from the docker-compose file itself, or from a customized entrypoint shellscript?
The problem I'm facing is that since there is no way I've yet found to specify the number of scaled instances from within the docker-compose file format itself, I'm relying on the person running the command to enter the scale factor consistently and to have that value align with the value I need to tell the master node to expect. And trusting users to do the right thing is a recipe for disaster. If I could continue to let the user specify the scale value on the command line, I need a way to determine what that value is at runtime.
The scale option is not available from Compose file version 3, but you may use replicas:
version: "3.7"
services:
  redis:
    image: redis:latest
    deploy:
      replicas: 1
and run it using:
docker-compose --compatibility up -d
docker-compose 1.20.0 introduces a new --compatibility flag designed
to help developers transition to version 3 more easily. When enabled,
docker-compose reads the deploy section of each service’s definition
and attempts to translate it into the equivalent version 2 parameter.
Currently, the following deploy keys are translated:
resources limits and memory reservations
replicas
restart_policy condition and max_attempts
but:
Do not use this in production!
We recommend against using --compatibility mode in production. Because
the resulting configuration is only an approximate using non-Swarm
mode properties, it may produce unexpected results.
see this
PS:
Docker container names must be unique, so you cannot scale a service beyond 1 container if you have specified a custom name. Attempting to do so results in an error.
Unfortunately, there is no way to define replicas for docker-compose; it only works for Docker Swarm. The documentation specifies it (link):
Tip: Alternatively, in Compose file version 3.x, you can specify replicas under the deploy key as part of a service configuration for Swarm mode. The deploy key and its sub-options (including replicas) only works with the docker stack deploy command, not docker-compose up or docker-compose run.
So if you have the deploy section in the YAML but run it with docker-compose, it will have no effect. For example:
version: "3.3"
services:
  alpine1:
    image: alpine
    container_name: alpine1
    command: ["/bin/sleep", "10000"]
    deploy:
      replicas: 4

  alpine2:
    image: alpine
    container_name: alpine2
    command: ["/bin/sleep", "10000"]
    deploy:
      replicas: 2
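With plain docker-compose (not Swarm), bringing this up starts exactly one container per service, regardless of the replicas: values; roughly, you can verify it like this:
docker-compose up -d
docker ps --format '{{.Names}}'   # one container per service, not 4 and 2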
So the only way to scale up in docker compose is by running the scale command manually.
docker-compose scale alpine1=3
Note: I had a job where they loved docker-compose, so we had bash scripts to perform operations like the ones you describe. For example, we would have something like ./controller-app.sh scale test_service=10, and it would run docker-compose scale test_service=10.
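A minimal sketch of such a wrapper (the script and service names are only examples):
#!/bin/bash
# controller-app.sh (sketch): thin wrapper so everyone scales services the same way.
# Usage: ./controller-app.sh scale test_service=10
set -e
case "$1" in
  scale)
    docker-compose scale "$2"
    ;;
  *)
    echo "Usage: $0 scale <service>=<count>" >&2
    exit 1
    ;;
esac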
UPDATE
To check the number of replicas, you can mount the Docker socket into your container, then run docker ps --format '{{.Names}}' | grep $YOUR_CONTAINER_NAME.
Here is how you would mount the socket.
docker run -v /var/run/docker.sock:/var/run/docker.sock -it alpine sh
Install docker
apk update
apk add docker
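Putting it together, the master's entrypoint could count the running slave containers at startup. This is a rough sketch: the grep pattern and the master command with its --num-slaves flag come from the question and must be adapted to your real names.
#!/bin/sh
# master-entrypoint.sh (sketch): requires /var/run/docker.sock mounted and the docker CLI installed.
# Count running containers of the slave service; adjust the pattern to your project/service naming.
NUM_SLAVES=$(docker ps --format '{{.Names}}' | grep -c 'slave')
exec /usr/local/bin/master --num-slaves="$NUM_SLAVES"   # hypothetical master command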
Is there a way to put some file, let's say data.json, into HDFS automatically right from Docker-compose/Dockerfile?
When I start the namenode and datanode, I can enter the containers with docker exec -it namenode [datanode] bash and use hdfs dfs -put data.json hdfs:/ (once safe mode is finished), and that works. But I need a way to run this automatically. When I try to build the containers from a Dockerfile and put in the commands:
FROM bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
WORKDIR /data
ADD hdfs_writer/data.json /data
# ADD python_script.py /data
CMD ["hdfs dfsadmin -safemode wait && hdfs dfs -put ./data.json hdfs:/"]
# CMD ["python python_script.py"]
The namenode container immediately terminates. I also tried with the Python script below, which I add to the container and run with CMD.
python_script
import time
import os

os.system("hdfs dfsadmin -safemode wait")
os.system("hdfs dfs -put -f data.json hdfs:/")

while True:
    time.sleep(5)
In that case the container keeps running, but if I check the logs and try to list HDFS with hdfs dfs -ls hdfs:/, I get the following error:
safemode: Call From 662aae005e8b/172.20.0.5 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
19/04/18 14:36:36 WARN ipc.Client: Failed to connect to server: namenode/172.20.0.5:8020: try once and fail.
I read the link recommended in the error log, and to be honest I am not sure I understand what I should do.
Any suggestions or ideas about a possible solution are highly valuable to me, as I am new to this field and don't have much experience.
If you need more info, I will be happy to provide it.
docker-compose.yml (just part of it)
namenode:
  # docker-compose.yml and Dockerfile are in the same directory
  build: .
  volumes:
    - ./data/namenode:/hadoop/dfs/name
  environment:
    - CLUSTER_NAME=cluster
  env_file:
    - ./hadoop.env
  ports:
    - 50070:50070

datanode:
  image: bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
  depends_on:
    - namenode
  volumes:
    - ./data/datanode:/hadoop/dfs/data
  env_file:
    - ./hadoop.env
hadoop.env
CORE_CONF_fs_defaultFS=hdfs://namenode:8020
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
HDFS_CONF_dfs_blocksize=1m
YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
You can't write to networked services in a Dockerfile. Imagine running docker build, running your combined application, tearing it down, and running it again. You'll reuse the same built image without re-running the Dockerfile steps; only the content in the image itself is kept. In most cases you need some minor amount of setup to communicate between services (Docker Compose can do this for you) but that is not set up during a build sequence. This is the same answer as "you can't run database migrations from a Dockerfile", but it applies equally to Hadoop.
A container only does one thing. Your sample Dockerfile sets a different CMD that waits for the namenode to be running and sets it up. This happens instead of starting the namenode process. A Docker container runs one main command and one main command only; there is not a way to run a main command and also a side support script of some form. The container you show would probably work, but you'd need to run it as a separate container alongside the namenode container.
You don't need to be "in Docker" to access Docker-hosted services. You can use a Docker Compose ports: directive to make services visible to the host, at which point you can use ordinary clients to interact with them. The docker exec path is the equivalent of "I ssh to my server as root, and then...", which isn't how you normally deal with any service at all.
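For instance, if you also publish the namenode's RPC port (e.g. 8020) under its ports: block and have a Hadoop client installed on the host, you could load the file from the host, roughly:
hdfs dfs -put data.json hdfs://localhost:8020/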
Your server containers should only run servers. In your example you're both trying to launch an HDFS namenode and also populate the server from the same container; you'd be better off having the namenode container only be the namenode and running the setup job from another container or from the host. (See the standard postgres image's entrypoint script for some idea of the gyrations required otherwise.)
Docker Compose isn't great for one-off jobs. Every time you run docker-compose up it will discover that your setup container isn't running and try to start it again. Other more powerful orchestrators could be a better fit; for example, a Kubernetes Job is a reasonable fit for what you're describing.
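With that caveat in mind, if you still want to keep the setup job inside Compose, a rough, untested sketch of the "separate setup container" idea is a one-off service alongside namenode and datanode. It reuses the bde2020 image only for its HDFS client, and depending on that image you may also need to adjust its entrypoint so it doesn't try to start a second namenode:
hdfs-loader:
  image: bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
  depends_on:
    - namenode
  env_file:
    - ./hadoop.env
  volumes:
    - ./hdfs_writer/data.json:/data/data.json
  entrypoint: ["sh", "-c"]
  command: ["hdfs dfsadmin -safemode wait && hdfs dfs -put -f /data/data.json hdfs:/"]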
I am attempting to add a volume to a Docker container that will be built and run in a Docker Compose system on a hosted build service (CircleCI). It works fine locally, but not remotely. CircleCI provides an SSH facility I can use to debug why a container is not behaving as expected.
The relevant portion of the Docker Compose file is thus:
missive-mongo:
  image: missive-mongo
  command: mongod -v --logpath /var/log/mongodb/mongodb.log --logappend
  volumes:
    - ${MONGO_LOCAL}:/data/db
    - ${LOGS_LOCAL_PATH}/mongo:/var/log/mongodb
  networks:
    - storage_network
Locally, if I do docker inspect integration_missive-mongo_1 (i.e. the running container name), I get the volumes as expected:
...
"HostConfig": {
"Binds": [
"/tmp/missive-volumes/logs/mongo:/var/log/mongodb:rw",
"/tmp/missive-volumes/mongo:/data/db:rw"
],
...
On the same container, I can shell in and see that the volume works fine:
docker exec -it integration_missive-mongo_1 sh
/ # tail /var/log/mongodb/mongodb.log
2017-11-28T22:50:14.452+0000 D STORAGE [initandlisten] admin.system.version: clearing plan cache - collection info cache reset
2017-11-28T22:50:14.452+0000 I INDEX [initandlisten] build index on: admin.system.version properties: { v: 2, key: { version: 1 }, name: "incompatible_with_version_32", ns: "admin.system.version" }
2017-11-28T22:50:14.452+0000 I INDEX [initandlisten] building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2017-11-28T22:50:14.452+0000 D INDEX [initandlisten] bulk commit starting for index: incompatible_with_version_32
2017-11-28T22:50:14.452+0000 D INDEX [initandlisten] done building bottom layer, going to commit
2017-11-28T22:50:14.454+0000 I INDEX [initandlisten] build index done. scanned 0 total records. 0 secs
2017-11-28T22:50:14.455+0000 I COMMAND [initandlisten] setting featureCompatibilityVersion to 3.4
2017-11-28T22:50:14.455+0000 I NETWORK [thread1] waiting for connections on port 27017
2017-11-28T22:50:14.455+0000 D COMMAND [PeriodicTaskRunner] BackgroundJob starting: PeriodicTaskRunner
2017-11-28T22:50:14.455+0000 D COMMAND [ClientCursorMonitor] BackgroundJob starting: ClientCursorMonitor
OK, now for the remote. I kick off a build, it fails because Mongo won't start, so I use the SSH facility that keeps a box alive after a failed build.
I first hack the DC file so that it does not try to launch Mongo, as it will fail. I just get it to sleep instead:
missive-mongo:
  image: missive-mongo
  command: sleep 1000
  volumes:
    - ${MONGO_LOCAL}:/data/db
    - ${LOGS_LOCAL_PATH}/mongo:/var/log/mongodb
  networks:
    - storage_network
I then run the docker-compose up script to bring all containers up, and then examine the problematic box: docker inspect integration_missive-mongo_1:
"HostConfig": {
"Binds": [
"/tmp/missive-volumes/logs/mongo:/var/log/mongodb:rw",
"/tmp/missive-volumes/mongo:/data/db:rw"
],
That looks fine. So on the host I create a dummy log file, and list it to prove it is there:
bash-4.3# ls /tmp/missive-volumes/logs/mongo
mongodb.log
So I try shelling in, docker exec -it integration_missive-mongo_1 sh again. This time I find that the folder exists, but not the volume contents:
/ # ls /var/log
mongodb
/ # ls /var/log/mongodb/
/ #
This is very odd, because the reliability of volumes in the remote Docker/Compose config has been exemplary up until now.
Theories
My main one at present is that the differing versions of Docker and Docker Compose could have something to do with it. So I will list out what I have:
Local
Host: Linux Mint
Docker version 1.13.1, build 092cba3
docker-compose version 1.8.0, build unknown
Remote
Host: I suspect it is Alpine (it uses apk for installing)
I am using the docker:17.05.0-ce-git image supplied by CircleCI; the version shows as Docker version 17.05.0-ce, build 89658be.
Docker Compose is installed via pip, and getting the version produces docker-compose version 1.13.0, build 1719ceb.
So, there is some version discrepancy. As a shot in the dark, I could try bumping up Docker/Compose, though I am wary of breaking other things.
What would be ideal though, is some sort of advanced Docker commands I can use to debug why the volume appears to be registered but is not exposed inside the container. Any ideas?
CircleCI runs docker-compose against a remote Docker daemon, so local bind mounts don't work.
A named volume will default to the local driver and would work in CircleCI's Compose setup; the volume will exist wherever the container runs.
Logging should generally be left to stdout and stderr in a single-process-per-container setup. Then you can make use of a logging driver plugin to ship logs to a central collector. MongoDB defaults to logging to stdout/stderr when run in the foreground.
Combining the volumes and logging:
version: "2.1"
services:
  syslog:
    image: deployable/rsyslog
    ports:
      - '1514:1514/udp'
      - '1514:1514/tcp'
  mongo:
    image: mongo
    command: mongod -v
    volumes:
      - 'mongo_data:/data/db'
    depends_on:
      - syslog
    logging:
      options:
        tag: '{{.FullID}} {{.Name}}'
        syslog-address: "tcp://10.8.8.8:1514"
      driver: syslog
volumes:
  mongo_data:
This is a little bit of a hack as the logging endpoint would normally be external, rather than a container in the same group. This is why the logging uses the external address and port mapping to access the syslog server. This connection is between the docker daemon and the log server, rather than container to container.
I wanted to add an additional answer to accompany the accepted one. My use-case on CircleCI is to run browser-based integration tests, in order to check that a whole stack is working correctly. A number of the 11 containers in use have volumes defined for various things, such as log output and raw database file storage.
What I had not realised until now is that volumes do not work in CircleCI's Docker executor, due to a technical Docker limitation. Because of this, in each previous case the files were simply written to an empty folder inside the container.
In my new case however, this issue was causing Mongo to fail. The reason for that was that I'm using --logappend to prevent Mongo from doing its own log rotation on start-up, and this switch requires the path specified in --logpath to exist. Since it existed on the host, but the volume creation failed, the container could not see the log file.
To fix this, I have modified my Mongo service entry to call a script in the command section:
missive-mongo:
  image: missive-mongo
  command: sh /root/mongo-logging.sh
And the script looks like this:
#!/bin/sh
#
# The command sets up logging in Mongo. The touch is for the benefit of any
# environment in which the logs do not already exist (e.g. Integration, since
# CircleCI does not support volumes)
touch /var/log/mongodb/mongodb.log \
&& mongod -v --logpath /var/log/mongodb/mongodb.log --logappend
In the two possible use cases, this will act as follows:
In the case of the mount working (dev, live) it will simply touch a file if it exists, and create it if it does not (e.g. a completely new environment),
In the case of the mount not working (CircleCI) it will create the file.
Either way, this is a nice safety feature to prevent Mongo blowing up.
I have a docker-compose file that spins up, among several others, a couchdb container (https://hub.docker.com/r/klaemo/couchdb/), and the couchdb container spews out a lot of output when I do docker-compose up. Is there a way to suppress that output so I see only the other containers' output?
Maybe
I can run the couchdb in daemon mode somehow?
or
I can override the default command somehow and redirect output to a tmp file?
I am not sure how to do either of these, and I want to do it within the compose file itself, not by changing the command I use to bring the compose file up. Any help?
Here is the minimal compose file:
couchdb:
  container_name: couchdb
  image: klaemo/couchdb:2.0.0
  ports:
    - "5984:5984"
and I call that from a makefile with: docker-compose up --abort-on-container-exit --force-recreate && docker-compose down
Note that Docker containers log to stdout and stderr for a reason. It allows a consistent log interface for commands like docker logs to use and for logging drivers to pick up information from containers. In a large container eco system, it's easier if everything works the same.
Runtime
At runtime there are a couple of options.
You can background the couchdb container and start the others in the foreground.
docker-compose up -d couchdb
docker-compose up other container names
You can start everything in the background, and only view the logs for particular containers
docker-compose start # or docker-compose up -d
docker-compose logs -f other container names
Build time
To permanently modify logging you could change CouchDB's log config in an image build
couchdb:
  container_name: couchdb
  image: me/klaemo-couchdb:2.0.0
  build:
    context: .
    dockerfile: Dockerfile.couchdb
  ports:
    - "5984:5984"
Dockerfile.couchdb
FROM klaemo/couchdb:2.0.0
COPY couchdb.ini /opt/couchdb/etc/local.ini
couchdb.ini needs to contain all the original config settings from the container's /opt/couchdb/etc/local.ini, updating the log settings from stderr to a file:
[log]
file = /opt/couchdb/log/couch.log
level = info
You can also set log levels specifically for a module
[log_level_by_module]
couch_httpd = info
couch_replicator = info
couch_query_servers = error
You probably want to mount the /opt/couchdb/log directory as a volume from the container host so you are not writing data into the current container instance all the time.
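For example (the host path is only an illustration), add a bind mount to the couchdb service above:
couchdb:
  # ... existing settings ...
  volumes:
    - ./couchdb-logs:/opt/couchdb/log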