Specify a file inside a container as Fluent Bit's input - docker

If I have a container writing its log to a file, e.g. /var/log/app.log, how can I configure Fluent Bit to read the container's log from that file? I have this configuration inside my K8S ConfigMap:
input-kubernetes.conf: |
  [INPUT]
      Name              tail
      Tag               kube.*
      Path              /var/log/containers/*.log
      Exclude_Path      /var/log/containers/*_kube-system_*.log, /var/log/containers/*fluent-bit*.log
      Parser            docker
      DB                /var/log/flb_kube.db
      Mem_Buf_Limit     5MB
      Skip_Long_Lines   On
      Refresh_Interval  10
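A tail input can only read paths that exist where Fluent Bit itself runs, so a log written inside another container has to be exposed to it first. One common pattern (a sketch, not from the original question: the emptyDir volume and the /var/log/app mount path are assumptions) is to share the app's log directory with a Fluent Bit sidecar via an emptyDir volume and add a second tail input for it:
[INPUT]
    Name              tail
    Tag               app.log
    # /var/log/app is the emptyDir volume shared with the app container (assumed path)
    Path              /var/log/app/app.log
    DB                /var/log/flb_app.db
    Mem_Buf_Limit     5MB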

How to monitor the size of a mounted directory via Telegraf docker container?

I am not particularly fluent with Linux, so I would appreciate some help.
My system has an external drive mounted at /mnt/SSD_240GB, which contains two directories whose sizes I am trying to monitor with Telegraf and InfluxDB. These directories are:
/mnt/SSD_240GB/docker_data/InfluxDB/data
and
/mnt/SSD_240GB/docker_data/InfluxDB/wal
Telegraf and InfluxDB are both running in separate Docker containers.
Following this answer, I have made a shell script that just uses du to get the sizes of any directories passed in as arguments:
#!/bin/bash
echo "["
du -s -B1 "$@" | awk '{if (NR!=1) {printf ",\n"};printf " { \"dir_size_bytes\": "$1", \"path\": \""$2"\" }";}'
echo
echo "]"
Running this script directly inside the Telegraf container works as expected, giving the correct file sizes.
Now I try to get Telegraf to run this script automatically at regular intervals using the exec plugin, by adding the following to my telegraf.conf file:
[[inputs.exec]]
  commands = [ "/etc/telegraf/scripts/get_disk_usage.sh /mnt/SSD_240GB/docker_data/InfluxDB/wal /mnt/SSD_240GB/docker_data/InfluxDB/data" ]
  timeout = "1m"
  name_override = "du"
  name_suffix = ""
  data_format = "json"
  tag_keys = [ "path" ]
The problem is that the data which now arrives in InfluxDB does not match this; in fact, the file sizes returned are always the same (20480 bytes and 4096 bytes, respectively):
Does anyone know how to resolve this? Thanks!
Here is the telegraf section of the docker-compose.yaml file:
telegraf:
  image: telegraf
  container_name: telegraf_container
  restart: always
  ports:
    - 8125:8125
  networks:
    - docker_monitoring_network
  volumes:
    - /mnt/SSD_240GB/docker_config_files/Telegraf/telegraf.conf:/etc/telegraf/telegraf.conf
    - /mnt/SSD_240GB/docker_config_files/Telegraf/scripts:/etc/telegraf/scripts
    - /mnt/SSD_240GB:/mnt/SSD_240GB:ro
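One way to narrow this down (a debugging sketch, not an answer from the thread: it only compares environments) is to run both the script and the exec plugin by hand inside the container, so you can see exactly what Telegraf's environment reports:
# run the script exactly as the exec plugin would
docker exec -it telegraf_container /etc/telegraf/scripts/get_disk_usage.sh /mnt/SSD_240GB/docker_data/InfluxDB/wal /mnt/SSD_240GB/docker_data/InfluxDB/data
# run all inputs once and print the gathered metrics to stdout
docker exec -it telegraf_container telegraf --test --config /etc/telegraf/telegraf.conf
If the two disagree, the exec plugin is likely seeing a different (or stale) view of the read-only mount than your interactive shell did.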

How to correctly set up environment variables for Airflow using docker-compose on Windows 10? airflow-init_1 ERROR: AIRFLOW_UID not set

I ran docker-compose up airflow-init in cmd to deploy Airflow using the template given in docker-compose.yaml; the logs, plugins, and dags folders were created:
C:\Users\Diego\Airflow>dir
Volume in drive C has no label.
Volume Serial Number is 4296-D163
Directory of C:\Users\Diego\Airflow
09/18/2021 07:45 PM <DIR> .
09/18/2021 07:45 PM <DIR> ..
09/18/2021 07:45 PM 44 .env
09/18/2021 05:59 PM <DIR> dags
09/18/2021 05:52 PM 8,031 docker-compose.yaml
09/18/2021 07:44 PM <DIR> logs
09/18/2021 06:02 PM <DIR> plugins
2 File(s) 8,075 bytes
5 Dir(s) 22,549,196,800 bytes free
When the command is run, the Error given is:
C:\Users\Diego\Airflow>docker-compose up airflow-init
WARNING: The AIRFLOW_UID variable is not set. Defaulting to a blank string.
WARNING: The AIRFLOW_GID variable is not set. Defaulting to a blank string.
Creating network "airflow_default" with the default driver
Creating volume "airflow_postgres-db-volume" with default driver
Creating airflow_redis_1 ... done Creating airflow_postgres_1 ... done Creating airflow_airflow-init_1 ... done Attaching to airflow_airflow-init_1
airflow-init_1 | ERROR!!!: AIRFLOW_UID not set!
airflow-init_1 | Please follow these instructions to set AIRFLOW_UID and AIRFLOW_GID environment variables:
airflow-init_1 | https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment
It seems to be an error synchronizing access from Docker to the folders, which on Linux or macOS is solved by creating an env-variables file with echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env.
I tried running the echo command, which does create the .env file in the directory, but this doesn't get airflow-init to run without the error, so it seems to work only on Linux. The volumes mounted in the container appear unable to use the Windows filesystem's user/group permissions, so the container and the host computer don't have matching file permissions.
How do I correctly set up environment variables for Airflow using docker-compose on Windows 10?
Docker Desktop 4.0.1
Steps:
Download the docker-compose.yaml file from the Airflow community webpage and place it in a directory
Create the logs, dags, and plugins folders in the same directory
Run docker-compose up airflow-init
You can manually create the .env file. Simply create a file that will contain two lines:
AIRFLOW_UID=50000
AIRFLOW_GID=0
Just make sure you save it as a plain-text file in whatever editor you edit it with. By default, if you save it in a Windows editor, it might be saved in the wrong encoding (http://www2.hawaii.edu/~jamess/textonly.htm).
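If you'd rather create it from the command line, this cmd one-liner is one option (a hedged example; the missing space before & is deliberate, since a space would otherwise end up inside the value):
(echo AIRFLOW_UID=50000& echo AIRFLOW_GID=0) > .env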
BTW, as of Airflow 2.1.4 (which might even be released today) you will not need that file on Windows. This has already been fixed but needs a release, so you might even wait for 2.1.4 to be out.

Unable to Run a Simple Python Script on Fluentd

I have a python script called script.py. When I run this script, it creates a logs folder on the Desktop and downloads all the necessary logs from a website, writing them as .log files in this logs folder. I want Fluentd to run this script every 5 minutes and do nothing more; the next source in the config file does the real job of sending this log data to another place. If I already have the logs folder on the Desktop, these log files are uploaded correctly to the next destination. But the script never runs. If I delete my logs folder locally, this is the output Fluentd gives:
2020-07-27 10:20:42 +0200 [trace]: #0 plugin/buffer.rb:350:enqueue_all: enqueueing all chunks in buffer instance=47448563172440
2020-07-27 10:21:09 +0200 [trace]: #0 plugin/buffer.rb:350:enqueue_all: enqueueing all chunks in buffer instance=47448563172440
2020-07-27 10:21:36 +0200 [debug]: #0 plugin_helper/child_process.rb:255:child_process_execute_once: Executing command title=:exec_input spawn=[{}, "python /home/zohair/Desktop/script.py"] mode=[:read] stderr=:discard
This never creates a logs folder on my Desktop, which the script does if run locally with python script.py.
If I already have the logs folder, I can see the logs on stdout normally. Here is my config file:
<source>
  @type exec
  command python /home/someuser/Desktop/script.py
  run_interval 5m
  <parse>
    @type none
    keys none
  </parse>
  <extract>
    tag_key none
  </extract>
</source>
<source>
  @type tail
  read_from_head true
  path /home/someuser/Desktop/logs/*
  tag sensor_1.log-raw-data
  refresh_interval 5m
  <parse>
    @type none
  </parse>
</source>
<match sensor_1.log-raw-data>
  @type stdout
</match>
I just need Fluentd to run the script and do nothing else, and let the other source take this data and send it somewhere else. Any solutions?
The problem was solved by adding another @type exec source that runs pip install -r requirements.txt, which fixed a missing-module error that was not being shown in the Fluentd error log (Fluentd was running as superuser).
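A minimal sketch of what that extra source could look like (the requirements.txt path and the tag are assumptions, not from the original setup):
<source>
  @type exec
  # install the script's dependencies so the silent missing-module error goes away (path assumed)
  command pip install -r /home/someuser/Desktop/requirements.txt
  run_interval 5m
  tag pip.install
  <parse>
    @type none
  </parse>
</source>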

How to add configuration to Logging Agent from Docker Container?

I'm trying to run a Docker container on Compute Engine. Everything works fine and my PHP app is correctly returning all data, but I want to increase log verbosity.
For now I've added two config files for fluentd inside a container config dir:
This one for nginx:
<source>
  @type tail
  format nginx
  path /var/log/feedbacks/nginx-access.log
  pos_file /var/lib/google-fluentd/pos/nginx-access.pos
  read_from_head true
  tag nginx-access
</source>
<source>
  @type tail
  format none
  path /var/log/feedbacks/nginx-error.log
  pos_file /var/lib/google-fluentd/pos/nginx-error.pos
  read_from_head true
  tag nginx-error
</source>
And this one for PHP log output:
<source>
  @type tail
  format /^\[(?<time>[\d\-]+ [\d\:]+)\] (?<channel>.+)\.(?<level>(DEBUG|INFO|NOTICE|WARNING|ERROR|CRITICAL|ALERT|EMERGENCY))\: (?<message>[^\{\}]*) (?<context>(\{.+\})|(\[.*\])) (?<extra>(\{.+\})|(\[.*\]))\s*$/
  path /var/log/feedbacks/structured.log
  pos_file /var/lib/google-fluentd/pos/feedbacks.pos
  read_from_head true
  tag feedbacks
</source>
I've mounted these 2 config files as follows, with the corresponding log files:
container path: /usr/src/app/var/logs/, host path: /var/log/feedbacks/, mode: r/w
container path: /usr/src/app/docker/runnable/fluentd/, host path: /etc/google-fluentd/config.d/, mode: r/w
But when I /bin/bash into these directories inside the stackdriver-logging-agent container, there is nothing inside. Maybe I'm missing something...
Thanks for helping!
The stackdriver-logging-agent reads a container's logs through the equivalent of docker logs [container]. This provides a consistent API for processes on the host OS to gather container logs.
By default, a container's stdout|stderr are sent to docker logs, and it is this stream that the stackdriver-logging-agent collects and sends on to the Stackdriver service.
If I understand correctly, you'd need to ensure that your PHP app is generating the richer logs and that these are being sent to stdout|stderr.
If you were to use Nginx's stock Docker image, it does this:
lrwxrwxrwx 1 root root 11 May 8 03:01 access.log -> /dev/stdout
lrwxrwxrwx 1 root root 11 May 8 03:01 error.log -> /dev/stderr
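If your app has to keep writing to files, one option is to borrow the same trick in your own image (a sketch; the paths are the log files from the question):
# redirect the app's log files to the container's stdout/stderr streams
RUN ln -sf /dev/stdout /var/log/feedbacks/nginx-access.log \
 && ln -sf /dev/stderr /var/log/feedbacks/nginx-error.log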
See Docker's documentation here:
https://docs.docker.com/config/containers/logging/
I was unable to find a good explanation for this for Container OS on Google's site.

Setting up Gitlab using Docker on Windows host, issue with shared folders

TLDR;
Does anyone know how to solve the "Failed asserting that ownership of "/var/opt/gitlab/git-data" was git" error?
Background:
I want to set up the GitLab Docker image on Windows Server 2012 R2 running Docker Toolbox, version 17.04.0-ce, build 4845c56.
Issue/Question
I can't get the shared folder to work properly on the D drive of the server. I read that I needed to add the folder to the VirtualBox VM, which I did via the settings/shared folder menu in the VirtualBox GUI. I gave the share the name "gitlab", pointed it at the path "D:\data\gitlab", then checked auto-mount and make permanent, and set it to full access.
I started the docker machine and ran "docker-machine ssh $machine-name". I noticed that there was no /media directory, so I added a folder at the home directory (/home/docker/gitlab) and then mounted the shared folder using the following command, found in several forums:
sudo mount -t vboxsf gitlab /home/docker/gitlab
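Note that the boot2docker VM behind Docker Toolbox has a largely ephemeral filesystem, so a manual mount like this may not survive a machine restart. If I recall correctly, commands placed in /var/lib/boot2docker/bootlocal.sh are re-run at boot (treat this persistence mechanism as an assumption to verify for your version):
# /var/lib/boot2docker/bootlocal.sh -- run at VM boot (assumed mechanism)
mkdir -p /home/docker/gitlab
mount -t vboxsf gitlab /home/docker/gitlab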
At this point I can add files to the Windows host directory or the Docker VM and it seems to work fine and the test files show up.
Now when I spin up the Gitlab Docker image, I use the following command modified from their documentation:
docker run --detach --hostname gitlab.example.com --publish 80:80 --name gitlab --volume /home/docker/gitlab:/etc/gitlab:Z --volume /home/docker/gitlab/logs:/var/log/gitlab:Z --volume /home/docker/gitlab/data:/var/opt/gitlab:Z gitlab/gitlab-ce
Now I know that it appears to be writing to the shared drive, because all of these files are generated, but then it crashes after a few seconds and I receive the following error log.
Error Log:
Thank you for using GitLab Docker Image!
Current version: gitlab-ce=9.3.6-ce.0
Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
And restart this container to reload settings.
To do it use docker exec:
docker exec -it gitlab vim /etc/gitlab/gitlab.rb
docker restart gitlab
For a comprehensive list of configuration options please see the Omnibus GitLab readme
https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/README.md
If this container fails to start due to permission problems try to fix it by executing:
docker exec -it gitlab update-permissions
docker restart gitlab
Installing gitlab.rb config...
Generating ssh_host_rsa_key...
Generating public/private rsa key pair.
Your identification has been saved in /etc/gitlab/ssh_host_rsa_key.
Your public key has been saved in /etc/gitlab/ssh_host_rsa_key.pub.
The key fingerprint is:
SHA256:GyFlf9tl7ZuEbuE+dwZUYiyahdsRzpC1T7kwyUvoD+o root@gitlab.example.com
The key's randomart image is:
+---[RSA 2048]----+
| o .+oo |
| o .o*+o+.o|
| . . o*#+oo+|
| . o+o.Oo= |
| S o o++..|
| + oo + o|
| o .+ + |
| . o. .o|
| E .o..|
+----[SHA256]-----+
Generating ssh_host_ecdsa_key...
Generating public/private ecdsa key pair.
Your identification has been saved in /etc/gitlab/ssh_host_ecdsa_key.
Your public key has been saved in /etc/gitlab/ssh_host_ecdsa_key.pub.
The key fingerprint is:
SHA256:Kb99jG8EtMuTSdIuqBT3GLeD1D0wwTEcQhKgVJUlBjs root@gitlab.example.com
The key's randomart image is:
+---[ECDSA 256]---+
| .o+=*=+=+ |
|.. oo..=.. |
|. E . * . |
| o + +.B |
| +.BS* * |
| . +o= B . |
| . . .o = |
| . o. + |
| . .+. |
+----[SHA256]-----+
Generating ssh_host_ed25519_key...
Generating public/private ed25519 key pair.
Your identification has been saved in /etc/gitlab/ssh_host_ed25519_key.
Your public key has been saved in /etc/gitlab/ssh_host_ed25519_key.pub.
The key fingerprint is:
SHA256:lVxpu0UoyNPWVY6D9c+m/bUTyvKP6vuR4cTOYwQ0j+U root@gitlab.example.com
The key's randomart image is:
+--[ED25519 256]--+
| . o +.=o..|
| +.=o#o.+ |
| o+=.Eo o|
| . + .o.|
| S B +|
| B o= |
| .Oo +|
| ..o+.+|
| .+*+.oo|
+----[SHA256]-----+
Preparing services...
Starting services...
Configuring GitLab package...
/opt/gitlab/embedded/bin/runsvdir-start: line 24: ulimit: pending signals: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 34: ulimit: max user processes: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 37: /proc/sys/fs/file-max: Read-only file system
Configuring GitLab...
================================================================================
Error executing action `run` on resource 'ruby_block[directory resource: /var/opt/gitlab/git-data]'
================================================================================
Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Failed asserting that ownership of "/var/opt/gitlab/git-data" was git
---- Begin output of set -x && [ "$(stat --printf='%U' $(readlink -f /var/opt/gitlab/git-data))" = 'git' ] ----
STDOUT:
STDERR: + readlink -f /var/opt/gitlab/git-data
+ stat --printf=%U /var/opt/gitlab/git-data
+ [ UNKNOWN = git ]
---- End output of set -x && [ "$(stat --printf='%U' $(readlink -f /var/opt/gitlab/git-data))" = 'git' ] ----
Ran set -x && [ "$(stat --printf='%U' $(readlink -f /var/opt/gitlab/git-data))" = 'git' ] returned 1
Cookbook Trace:
---------------
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/libraries/storage_directory_helper.rb:124:in `validate_command'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/libraries/storage_directory_helper.rb:112:in `block in validate'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/libraries/storage_directory_helper.rb:111:in `each_index'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/libraries/storage_directory_helper.rb:111:in `validate'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/libraries/storage_directory_helper.rb:87:in `validate!'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/storage_directory.rb:35:in `block (3 levels) in from_file'
Resource Declaration:
---------------------
# In /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/storage_directory.rb
26: ruby_block "directory resource: #{params[:path]}" do
27: block do
28: # Ensure the directory exists
29: storage_helper.ensure_directory_exists(params[:path])
30:
31: # Ensure the permissions are set
32: storage_helper.ensure_permissions_set(params[:path])
33:
34: # Error out if we have not achieved the target permissions
35: storage_helper.validate!(params[:path])
36: end
37: not_if { storage_helper.validate(params[:path]) }
38: end
39: end
Compiled Resource:
------------------
# Declared in /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/storage_directory.rb:26:in `block in from_file'
ruby_block("directory resource: /var/opt/gitlab/git-data") do
params {:path=>"/var/opt/gitlab/git-data", :owner=>"git", :group=>nil, :mode=>"0700", :name=>"/var/opt/gitlab/git-data"}
action [:run]
retries 0
retry_delay 2
default_guard_interpreter :default
block_name "directory resource: /var/opt/gitlab/git-data"
declared_type :ruby_block
cookbook_name "gitlab"
recipe_name "gitlab-shell"
block #<Proc:0x000000054a99a8@/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/storage_directory.rb:27>
not_if { #code block }
end
Platform:
---------
x86_64-linux
Does anyone know how to solve the "Failed asserting that ownership of "/var/opt/gitlab/git-data" was git" error? I'm still somewhat new to Docker and to setting up GitLab, so it's very possible I've overlooked something simple. I've spent several hours Googling this, and it seems that others also have a lot of issues getting shared folders to work from Windows using Docker Toolbox, so hopefully this will help others as well.
Background
One solution (maybe not the best) for those of us stuck in a world without native Docker is to use VDI drives and shared folders. The VDI drive can live on any drive we want (which is important if you don't want to use the C drive) and is used to give the GitLab container the ability to chown anything it wants, so this is where we'll store the persistent volumes. The downside is that a VDI is not as transparent as a simple shared folder, so for backups, a shared folder makes things a little bit easier and more transparent.
Disclaimer
I'm not an expert on any of this, so please use caution and take what I say with a grain of salt.
Steps to perform
Create a new vdi drive and shared folder on any drive you'd like
Turn off your docker machine you want to use for gitlab
In virtualbox go into the settings on your docker-machine, then Storage, and click Add Hard Disk icon, then Create new disk
Select VDI (VirtualBox Disk Image) and click Next
Select Dynamically allocated and click Next
Select the name and location where you want to store the VDI by clicking the folder with the green caret symbol, then select the max size the VDI can grow to, and click Create
Now in the settings menu, switch to Shared Folders and click Adds new shared folder icon
Create a gitlabbackups folder to where ever you want and select Auto-mount and Make Permanent
Now partition and format the drive
Start/enter the docker machine (either use VBox window or docker-machine ssh <your docker machine name> from cmd prompt)
Run fdisk -l to list the available drives; if you've only attached the one extra VDI drive, you should see something like /dev/sdb
The next steps are irreversible, so perform them at your own discretion: enter the command fdisk /dev/sdb, then n for new partition, p for primary, and 1
Now format the new partition (you might need sudo as well): mkfs.ext4 /dev/sdb1
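The compose file below mounts this partition at /mnt/sdb1, so it's worth creating the mount point and mounting it once by hand to confirm the partition works (a hedged extra step; run.bat below repeats the mount on every start):
sudo mkdir -p /mnt/sdb1
sudo mount /dev/sdb1 /mnt/sdb1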
Run docker with persistent volumes on second vdi and backups in shared folder
Sample Dockerfile:
FROM gitlab/gitlab-ce:latest
RUN apt-get update
RUN apt-get install -y cron
# Add a cron job to backup everyday
RUN echo "0 5 * * * /opt/gitlab/bin/gitlab-rake gitlab:backup:create STRATEGY=copy CRON=1" | crontab -
# For an unknown reason, the cron job won't actually run unless cron is restarted
CMD service cron restart && \
/assets/wrapper
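Once the container is up, a quick hedged check that the backup cron job was registered (use whatever container name docker ps shows for yours):
docker exec -it <gitlab container name> crontab -l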
Sample docker-compose.yml:
version: "3.0"
services:
gitlab:
build: .
restart: always
ports:
- "80:80"
volumes:
# These volumes are on the vdi we created above
- "/mnt/sdb1/etc/gitlab/:/etc/gitlab"
- "/mnt/sdb1/var/log/gitlab:/var/log/gitlab"
- "/mnt/sdb1/var/opt/gitlab:/var/opt/gitlab"
# This volume sits in the shared folder defined above
- "/gitlabbackups:/var/opt/gitlab/backups"
cap_add:
# These seem to be necessary for the mounted drive to work properly
# https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities
- SYS_ADMIN
- DAC_READ_SEARCH
Because there seems to be an issue with auto mounting the vdi, use a startup script, for example (assuming you used a D drive, just replace anything inside <...> as needed), sample run.bat:
@cd /d D:\<path to docker-compose.yml, assuming it's on the D drive>
@docker-machine start <docker machine name>
@FOR /f "tokens=*" %%i IN ('docker-machine env <docker machine name>') DO @%%i
@docker-machine ssh <docker machine name> sudo mount /dev/sdb1 /mnt/sdb1
@docker-compose build
@docker-compose up -d
@REM If the docker machine was completely off, running only 'docker-compose up -d' will
@REM not mount the volumes properly. Stopping and restarting the container results in
@REM the volumes mounting properly.
@docker stop <gitlab container name>
@docker start <gitlab container name>
@pause
Note: the gitlab container name can be found by running docker-compose up once and then docker ps -a to check it; it usually follows the convention <directory compose file is in>_<name in the compose file, e.g. gitlab here>_1
Assuming all went well and you've changed everything in the <...>'s above for your situation, you should be able to run the batch file and have GitLab up and running in such a way that it stores everything on the alternate drive: persistent working files in the VDI (to get around VBox POSIX limitations) and backups transparently stored in the shared folder.
Hope this helps other poor souls that don't have access to native docker yet.
