I am using fluentd through the td-agent installation, and I noticed that the default td-agent.log file keeps growing with no log rotation.
I start td-agent with the following command:
/etc/init.d/td-agent start
I found the following page, which explains how to configure rotation, but it seems to apply to the fluentd command itself:
https://docs.fluentd.org/deployment/logging
Does anyone know how to configure rotation with the command I am using? I also have the td-agent config file.
You can do this in two ways. The first is with td-agent itself, which requires editing the init script /etc/init.d/td-agent. Find the following line in the file:
TD_AGENT_ARGS="${TD_AGENT_ARGS:-${TD_AGENT_BIN_FILE} --log ${TD_AGENT_LOG_FILE} ${TD_AGENT_OPTIONS}}"
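and change it to the following, where --log-rotate-age is the number of rotated files to keep and --log-rotate-size is the maximum size of each log file in bytes (104857600 bytes is 100 MiB):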
TD_AGENT_ARGS="${TD_AGENT_ARGS:-${TD_AGENT_BIN_FILE} --log-rotate-age 5 --log-rotate-size 104857600 --log ${TD_AGENT_LOG_FILE} ${TD_AGENT_OPTIONS}}"
Then restart td-agent. The new options should appear in the process list, as shown below:
16467 /opt/td-agent/embedded/bin/ruby /usr/sbin/td-agent --log-rotate-age 5 --log-rotate-size 104857600 --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid
16472 /opt/td-agent/embedded/bin/ruby -Eascii-8bit:ascii-8bit /usr/sbin/td-agent --log-rotate-age 5 --log-rotate-size 104857600 --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid --
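As a sanity check on the size value, here is a minimal shell sketch that derives the byte count for a 100 MiB limit and prints the two options in the form they would be spliced into TD_AGENT_ARGS. The helper name mib_to_bytes is made up for illustration and is not part of td-agent:

```shell
#!/bin/sh
# Hypothetical helper: convert a size in MiB to bytes for --log-rotate-size.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

rotate_size=$(mib_to_bytes 100)   # maximum size of one log file, in bytes
rotate_age=5                      # number of rotated files to keep

# Print the options exactly as they would appear in TD_AGENT_ARGS.
echo "--log-rotate-age ${rotate_age} --log-rotate-size ${rotate_size}"
```

Running it prints --log-rotate-age 5 --log-rotate-size 104857600, which matches the values used in the init script above.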
The second method is to rotate the logs with logrotate. Make sure logrotate is installed, then create the file below on your server and logrotate will take care of the rotation:
cat /etc/logrotate.d/td-agent
/var/log/td-agent/td-agent.log {
  daily
  rotate 30
  compress
  delaycompress
  notifempty
  create 640 td-agent td-agent
  sharedscripts
  postrotate
    pid=/var/run/td-agent/td-agent.pid
    if [ -s "$pid" ]
    then
      kill -USR1 "$(cat $pid)"
    fi
  endscript
}
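For context, the postrotate block relies on fluentd reopening its log file when it receives SIGUSR1. The same logic can be sketched as a standalone shell function for trying it by hand. The name signal_reopen is hypothetical and is not part of td-agent or logrotate:

```shell
#!/bin/sh
# Hypothetical helper mirroring the postrotate block: send SIGUSR1 to the
# daemon only if the pid file exists and is non-empty.
signal_reopen() {
    pidfile=$1
    if [ -s "$pidfile" ]; then
        kill -USR1 "$(cat "$pidfile")"
    else
        echo "no usable pid file at $pidfile, nothing to signal" >&2
        return 1
    fi
}
```

It would be called as signal_reopen /var/run/td-agent/td-agent.pid. To check the logrotate file itself without rotating anything, logrotate -d /etc/logrotate.d/td-agent performs a dry run and prints what it would do.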
Related
I'm looking to redirect some logs from a command run with kubectl exec to that pod's logs, so that they can be read with kubectl logs <pod-name> (or really, /var/log/containers/<pod-name>.log). I can see the logs I need as output when running the command, and they're stored inside a separate log directory inside the running container.
Redirecting the output (i.e. >> logfile.log) to the file which I thought was mirroring what is in kubectl logs <pod-name> does not update that container's logs, and neither does redirecting to stdout.
When calling kubectl logs <pod-name>, my understanding is that the kubelet gets them from its internal /var/log/containers/ directory. But what determines which logs are stored there? Is it the same mechanism that stores logs for any other Docker container?
Is there a way to examine/trace the logging process, or determine where these logs are coming from?
Logs from the STDOUT and STDERR of the containers in the pod are captured and stored in files under /var/log/containers. This is what is presented when kubectl logs is run.
To understand why output from commands run by kubectl exec is not shown by kubectl logs, let's look at how it all works with an example.
First, launch a pod running ubuntu that sleeps forever:
$> kubectl run test --image=ubuntu --restart=Never -- sleep infinity
Exec into it:
$> kubectl exec -it test -- bash
Seen from inside the container it is the STDOUT and STDERR of PID 1 that are being captured. When you do a kubectl exec into the container a new process is created living alongside PID 1:
root@test:/# ps -auxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 7 0.0 0.0 18504 3400 pts/0 Ss 20:04 0:00 bash
root 19 0.0 0.0 34396 2908 pts/0 R+ 20:07 0:00 \_ ps -auxf
root 1 0.0 0.0 4528 836 ? Ss 20:03 0:00 sleep infinity
Redirecting to STDOUT does not work because /dev/stdout is a symlink to /proc/self/fd/1, the stdout of whichever process opens it, rather than to /proc/1/fd/1, the stdout of PID 1:
root@test:/# ls -lrt /dev/stdout
lrwxrwxrwx 1 root root 15 Nov 5 20:03 /dev/stdout -> /proc/self/fd/1
To see the output of commands run with kubectl exec in the logs, redirect it to the streams that the kubelet captures (STDOUT and STDERR of PID 1). This can be done by redirecting output to /proc/1/fd/1:
root@test:/# echo "Hello" > /proc/1/fd/1
After exiting the interactive shell, kubectl logs should now show the output:
$> kubectl logs test
Hello
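If several commands need to log this way, the redirection can be wrapped in a small shell function. This is only a sketch: the function name log_to_container and the LOG_TARGET override are made up for illustration, and it assumes the exec'd process is allowed to write to PID 1's file descriptors (typically the same user or root):

```shell
#!/bin/sh
# Hypothetical helper: append a message to the stdout of PID 1 so that it
# ends up in the container's captured log stream.
# LOG_TARGET is an illustrative override for trying this outside a container.
log_to_container() {
    target=${LOG_TARGET:-/proc/1/fd/1}
    printf '%s\n' "$*" >> "$target"
}
```

Inside the container, log_to_container "Hello" has the same effect as the echo above. A whole command can be redirected the same way, for example some-command > /proc/1/fd/1 2> /proc/1/fd/2, where some-command is a placeholder.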
Recently I came across a Linux application design. The intent of the application is to log Ethernet frames using dumpcap <> on Linux. It is implemented as follows:
1. Create a new process using fork()
2. Call dumpcap <> via execl(), as shown below
   a. execl("/bin/sh", "/bin/sh", "-c", dumpcap<>, NULL);
   b. sudo dumpcap -i "eth0" -B 1 -b filesize:5 -w "/mnt/Test_1561890567.pcapng" -t -q
3. Send a SIGTERM to kill the process
The problem we are facing now is that whenever we run the command from the process, dumpcap stops logging after 1184 or 1185 files. The process and its thread are still alive, and we can still see the command in top.
I have the following shell script that allows me to start my rails app, let's say it's called start-app.sh:
#!/bin/bash
cd /var/www/project/current
. /home/user/.rvm/environments/ruby-2.3.3
RAILS_SERVE_STATIC_FILES=true RAILS_ENV=production nohup bundle exec rails s -e production -p 4445 > /var/www/project/log/production.log 2>&1 &
The file above has the following permissions:
-rwxr-xr-x 1 user user 410 Mar 21 10:00 start-app.sh*
If I want to check the process, I do the following:
ps aux | grep -v grep | grep ":4445"
which gives me the following output:
user 2960 0.0 7.0 975160 144408 ? Sl 10:37 0:07 puma 3.12.0 (tcp://0.0.0.0:4445) [20180809094218]
P.S.: I grep for ":4445" because I have a few processes running on different ports (for different projects).
Now, coming to monit: I installed it with apt-get, and the latest version in the repo is 5.16, as I'm running Ubuntu 16.04. Also note that monit runs as root, which is why I specified the uid and gid below (the start script is meant to be executed as "user", not "root").
Here's the configuration for monit:
set daemon 20 # check services at 20 seconds interval
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
set eventqueue
    basedir /var/lib/monit/events # set the base directory where events will be stored
    slots 100 # optionally limit the queue size
set mailserver xx.com port xxx
    username "xx@xx.com" password "xxxxxx"
    using tlsv12
    with timeout 20 seconds
set alert xx@xx.com
set mail-format {
    from: xx@xx.com
    subject: monit alert -- $EVENT $SERVICE
    message: $EVENT Service $SERVICE
        Date: $DATE
        Action: $ACTION
        Host: $HOST
        Description: $DESCRIPTION
}
set limits {
    programOutput: 51200 B
    sendExpectBuffer: 25600 B
    fileContentBuffer: 51200 B
    networktimeout: 10 s
}
check system $HOST
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if cpu usage > 90% for 10 cycles then alert
    if memory usage > 85% then alert
    if swap usage > 35% then alert
check process nginx with pidfile /var/run/nginx.pid
    start program = "/bin/systemctl start nginx"
    stop program = "/bin/systemctl stop nginx"
check process redis
    matching "redis"
    start program = "/bin/systemctl start redis"
    stop program = "/bin/systemctl stop redis"
check process myapp
    matching ":4445"
    start program = "/bin/bash -c '/home/user/start-app.sh'" as uid "user" and gid "user"
    stop program = "/bin/bash -c /home/user/stop-app.sh" as uid "user" and gid "user"
include /etc/monit/conf.d/*
include /etc/monit/conf-enabled/*
Now monit detects and alerts me when the process goes down (if I kill it manually) and when it is manually recovered, but it won't start the shell script automatically. According to /var/log/monit.log, this is what happens:
[UTC Aug 13 10:16:41] info : Starting Monit 5.16 daemon
[UTC Aug 13 10:16:41] info : 'production-server' Monit 5.16 started
[UTC Aug 13 10:16:43] error : 'myapp' process is not running
[UTC Aug 13 10:16:46] info : 'myapp' trying to restart
[UTC Aug 13 10:16:46] info : 'myapp' start: /bin/bash
[UTC Aug 13 10:17:17] error : 'myapp' failed to start (exit status 0) -- no output
So far, what I see when monit tries to execute the script is that it tries to load it: I can see it for less than 3 seconds using ps aux | grep -v grep | grep ":4445". That output is different from the output I showed above, though. It shows the content of the shell script being executed, specifically this line:
blablalba... nohup bundle exec rails s -e production -p 4445
and then it disappears. Then monit tries to re-execute the script, again and again.
What am I missing, and what is wrong with my configuration? Note that I can't change anything in start-app.sh because it's in production and working 100% (I just want to monitor it).
Edit: To my understanding and experience, it seems to be an environment variable or path issue, but I'm not sure how to solve it. It doesn't make sense to put the env variables inside monit: what if someone else wanted to edit that shell script or add something new? I hope you get my point.
As I expected, it was a user-environment issue, and I solved it by editing the monit configuration as below:
Before (not working)
check process myapp
    matching ":4445"
    start program = "/bin/bash -c '/home/user/start-app.sh'" as uid "user" and gid "user"
    stop program = "/bin/bash -c /home/user/stop-app.sh" as uid "user" and gid "user"
After (working)
check process myapp
    matching ":4445"
    start program = "/bin/su -s /bin/bash -c '/home/user/start-app.sh' user"
    stop program = "/bin/su -s /bin/bash -c '/home/user/stop-app.sh' user"
Explanation: I removed as uid "user" and gid "user" from monit, because with those options monit only executes the shell script under the identity of "user"; it does not import or use that user's PATH or other environment variables. Running the script through su sets up at least the basic user environment (for example HOME) before the script starts.
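One way to confirm this kind of environment difference, assuming a throwaway script is acceptable on the host, is to have monit run a small diagnostic script with each variant of the start program line and compare the results. The function name print_env_summary is illustrative and is not something monit provides:

```shell
#!/bin/bash
# Hypothetical diagnostic: print the parts of the environment that tools
# such as rvm and bundler commonly depend on.
print_env_summary() {
    echo "user=$(id -un)"
    echo "HOME=${HOME:-<unset>}"
    echo "PATH=${PATH:-<unset>}"
    echo "SHELL=${SHELL:-<unset>}"
}

print_env_summary
```

Redirecting the output to a file, once when started via as uid "user" and gid "user" and once via su, and then diffing the two files should show which variables differ between the two setups.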
Assume the following:
I have a program myprogram inside a docker container
I'm running the docker container with
docker run --privileged=true my-label/my-container
Inside the container - the program is being run with:
strace -f -e trace=desc ./myprogram
What I see is that strace (despite the -f flag) doesn't seem to follow all the child processes.
I see the following output from strace
[pid 10] 07:36:46.668931 write(2, "..\n"..., 454 <unfinished ...>
<stdout of ..>
<stdout other output - but I don't see the write commands - so probably from a child process>
[pid 10] 07:36:46.669684 write(2, "My final output\n", 24 <unfinished ...>
<stdout of My final output>
What I want to see is the other write calls.
I should be seeing those other write calls, because I'm using -f.
What I think is happening is that running inside docker makes the process handling and security different.
My question is: Does strace -f work differently when run inside a docker container?
Note that this application starts and stops in 2 seconds - so the tracing tool has to follow the application lifecycle - like strace does. Connecting to a server background process won't work.
It turns out strace truncates string output: by default it prints only the first 32 characters of each string, so you have to tell it explicitly that you want more. You do this with -s, for example -s 800.
strace -s 800 -ff ./myprogram
You can also restrict the trace to just the write calls by asking strace explicitly with -e write.
strace -s 800 -ff -e write ./myprogram
I am trying to run Ruby on Rails under Passenger with Apache 2 on Fedora 19, and I got this error in the log:
[Tue Feb 25 09:37:52.367683 2014] [passenger:error] [pid 2779] ***
Passenger could not be initialized because of this error: Unable to
start the Phusion Passenger watchdog because it encountered the
following error during startup: Cannot change the directory
'/tmp/passenger.1.0.2779/generation-1/buffered_uploads' its UID to 48
and GID to 48: Operation not permitted (errno=1)
That directory (/tmp/passenger.1.0.2779) doesn't even exist. I think the problem is with SELinux; I have been trying to solve it for about 4 hours. Httpd is running as user apache and group apache. I tried:
cat /var/log/audit/audit.log | grep passenger | audit2allow -M passenger
semodule -i passenger.pp
but still nothing.
In your case, you should first switch SELinux into Permissive mode, then capture the audit log from the moment Apache starts until your application is running [1].
Once you can load the home page of your application, you can build a custom policy from the logs.
Switch SELinux into Permissive mode and clean audit.log:
]# setenforce 0
]# rm /var/log/audit/audit.log
]# service auditd restart
Restart Apache
]# service httpd restart
Try to open your application in a web browser.
This should produce more information in the audit log about what happens while your application is running.
Make a custom policy module to allow these actions:
]# mkdir work
]# cd work
]# grep httpd /var/log/audit/audit.log | audit2allow -M passenger
]# ls
passenger.pp passenger.te
Load the passenger policy module into the current SELinux policy using the semodule command, then switch back to Enforcing mode:
]# semodule -i passenger.pp
]# setenforce 1
Restart Apache
]# service httpd restart
References:
http://wiki.centos.org/HowTos/SELinux#head-faa96b3fdd922004cdb988c1989e56191c257c01
I ran into a similar error: a startup failure about being unable to create a directory that did not exist (logs, not tmp, but the same sort of thing). I, too, battled with it for an hour and couldn't make sense of it. I created, deleted, and chmod-ed the directory in many ways without success.
The fix for me was to change the parameters to passenger start. Initially, my Docker container started passenger with:
exec bundle exec passenger start --auto --disable-security-update-check --min-instances 20 --max-pool-size 20 --max-request-queue-size 500
I removed all parameters, leaving just this:
exec bundle exec passenger start
At this point, passenger could create the log folder and file, and all was well. I could have restored the params at this point, but we decided they were not needed for the development environment, so we left them out going forward.
In hindsight, I have a hunch that I deleted the log directory while a file in it was still open, and the file system persisted that condition in some way. But that's just a hunch. Perhaps simply rebooting my Mac would have fixed it...