I have a problem with supervisor in Docker. I use supervisor to start 4 .sh scripts: datagrid.sh, ml.sh, startmap.sh and dirwatcher.sh.
When I open a shell in the container, navigate to the scripts directory and start the scripts manually, everything works and all the scripts start, but they do not start when the container starts. I assume the problem is with supervisor. Thank you.
The error:
2018-08-08 12:28:08,512 INFO spawned: 'datagrid' with pid 171
2018-08-08 12:28:08,514 INFO spawned: 'dirwatcher' with pid 172
2018-08-08 12:28:08,517 INFO spawned: 'startmap' with pid 173
2018-08-08 12:28:08,519 INFO spawned: 'ml' with pid 175
2018-08-08 12:28:08,520 INFO exited: datagrid (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: dirwatcher (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: startmap (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: ml (exit status 0; not expected)
2018-08-08 12:28:08,527 INFO gave up: datagrid entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,532 INFO gave up: ml entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,537 INFO gave up: startmap entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,539 INFO gave up: dirwatcher entered FATAL state, too many start retries too quickly
My supervisord.conf file:
[supervisord]
nodaemon=false
[program:datagrid]
command=sh /EscomledML/MLScripts/escomled_data_grid.sh start -D
[program:dirwatcher]
command=sh /EscomledML/MLScripts/escomled_dirwatcher.sh start -D
[program:startmap]
command=sh /EscomledML/MLScripts/escomled_startmap.sh start -D
[program:ml]
command=sh /EscomledML/MLScripts/escomled_ml.sh start -D
I use alpine linux in the container.
There are a few problems here.
The following statement:
[supervisord]
nodaemon=false
This makes Supervisord run as a daemon, so it detaches into the background, but the container needs a main process that stays in the foreground.
Try changing it to
[supervisord]
nodaemon=true
This configuration makes Supervisord itself run as a foreground process, which will keep the container up and running.
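As a quick sanity check before building the image, the setting can be verified programmatically. The following is a minimal sketch using Python's standard `configparser`; the function name `find_foreground_problems` is hypothetical, and the accepted truthy spellings (`true`, `yes`, `on`, `1`) are an assumption about what Supervisord treats as true.

```python
import configparser

def find_foreground_problems(conf_text):
    """Return warnings for a supervisord.conf meant to be a container's main process."""
    parser = configparser.ConfigParser(
        interpolation=None,              # leave %(here)s-style expansions untouched
        inline_comment_prefixes=(";",),  # supervisor-style trailing comments
        strict=False,
    )
    parser.read_string(conf_text)
    warnings = []
    # If the option (or the whole section) is missing, assume the default of false.
    nodaemon = parser.get("supervisord", "nodaemon", fallback="false")
    if nodaemon.strip().lower() not in ("true", "yes", "on", "1"):
        warnings.append("nodaemon is not true: supervisord will detach from the foreground")
    return warnings

# The [supervisord] section from the question.
question_conf = """
[supervisord]
nodaemon=false
[program:datagrid]
command=sh /EscomledML/MLScripts/escomled_data_grid.sh start -D
"""
for warning in find_foreground_problems(question_conf):
    print(warning)
```

This only inspects the one setting discussed above; it does not validate the rest of the file.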
From the logs
'520 INFO exited: datagrid (exit status 0; not expected)'
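Supervisord logs this exit as "not expected" even though the exit code is 0, because the process exited before `startsecs` (1 second by default) had elapsed; it retries a few times and then gives up. Add the following to the conf for all the processes. This tells Supervisord to restart a process only if its exit code is not 0. If the scripts are meant to exit immediately after launching something in the background, `startsecs=0` is also needed so that the quick exit counts as a successful start.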
[program:datagrid]
command=sh /EscomledML/MLScripts/escomled_data_grid.sh start -D
autorestart=unexpected
exitcodes=0
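To see why a clean exit can still be logged as "not expected", here is a simplified model of how the label is chosen. It is a sketch under stated assumptions, not Supervisord's actual code: the function name `classify_exit` is hypothetical, and the model covers only the two settings `startsecs` and `exitcodes`.

```python
def classify_exit(uptime_secs, exit_code, startsecs=1, exitcodes=(0,)):
    """Simplified model of the 'expected' / 'not expected' label in the log."""
    if uptime_secs < startsecs:
        # The process never stayed up long enough to leave the STARTING state,
        # so the exit counts as a failed start regardless of the exit code.
        return "not expected"
    return "expected" if exit_code in exitcodes else "not expected"

# datagrid in the question: spawned at 12:28:08,512 and exited at 12:28:08,520.
print(classify_exit(0.008, 0))               # default startsecs=1
print(classify_exit(0.008, 0, startsecs=0))  # quick exit allowed
print(classify_exit(1.3, 0))                 # stayed up past startsecs, then exited 0
```

Under this model, `exitcodes=0` on its own does not change the first case; only a longer-lived foreground process or `startsecs=0` does.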
Related
I want to start rsyslog as an additional process in a Docker container, because my main service requires it for logging, so I am trying to set it up with supervisor. But the following fails with a restart loop for rsyslog. Why?
Dockerfile:
FROM debian:buster-slim
RUN set -e \
&& apt-get update \
&& apt-get install --yes \
rsyslog \
supervisor
COPY /services/rsyslog.conf /etc/rsyslog.d/console.conf
CMD ["supervisord", "-c", "/etc/supervisor.conf"]
supervisor.conf:
[supervisord]
#start in foreground
nodaemon=true
[program:syslog]
command=service rsyslog start
#[programm:another]
#command=...
Result:
process | 2022-10-27 10:07:09,906 INFO Set uid to user 0 succeeded
process | 2022-10-27 10:07:09,907 INFO supervisord started with pid 1
process | 2022-10-27 10:07:10,910 INFO spawned: 'syslog' with pid 9
process | 2022-10-27 10:07:10,987 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:11,990 INFO spawned: 'syslog' with pid 22
process | 2022-10-27 10:07:11,999 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:14,003 INFO spawned: 'syslog' with pid 28
process | 2022-10-27 10:07:14,014 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:17,020 INFO spawned: 'syslog' with pid 34
process | 2022-10-27 10:07:17,030 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:17,031 INFO gave up: syslog entered FATAL state, too many start retries too quickly
I have created a Dockerfile that uses Supervisor.
I have added 2 processes to the Supervisord properties file.
The 1st process runs httpd or tomcat.
The 2nd process calls an sh file. The sh file uses echo and read to accept user input and insert it into a property file.
The intention is to run the 1st process in the background and have the 2nd process wait for user input.
When I run the Docker image, the 2nd process executes but does not wait for the input. Why?
2021-02-09 16:46:32,901 CRIT Supervisor running as root (no user in config file)
2021-02-09 16:46:32,901 WARN No file matches via include "/etc/supervisord/*.conf"
2021-02-09 16:46:32,903 INFO supervisord started with pid 1
2021-02-09 16:46:33,908 INFO spawned: 'supervisor_stdout' with pid 10
2021-02-09 16:46:33,911 INFO spawned: 'UserInput' with pid 11
2021-02-09 16:46:33,914 DEBG 'UserInput' stdout output:
BMC_DATABASE_HOST:
2021-02-09 16:46:33,939 DEBG 'supervisor_stdout' stdout output:
READY
2021-02-09 16:46:33,940 DEBG supervisor_stdout: ACKNOWLEDGED -> READY
2021-02-09 16:46:34,942 INFO success: supervisor_stdout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-02-09 16:46:34,942 INFO success: UserInput entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
I am a novice in Docker and want to use Sensu for monitoring containers. I have set up a Sensu server and a Sensu client (on the host where my Docker containers are running) using the material below:
http://devopscube.com/monitor-docker-containers-guide/
I get the Sensu client information in the Uchiwa dashboard when running the command below:
docker run -d --name sensu-client --privileged \
-v $PWD/load-docker-metrics.sh:/etc/sensu/plugins/load-docker-metrics.sh \
-v /var/run/docker.sock:/var/run/docker.sock \
usman/sensu-client SENSU_SERVER_IP RABIT_MQ_USER RABIT_MQ_PASSWORD CLIENT_NAME CLIENT_IP
However, when I start a new container on the same host machine, I do not get that client's information in the Uchiwa dashboard.
It would be great if anyone who has used Sensu to monitor Docker containers could give some guidance on this.
Thanks for your time.
Please post the logs of the sensu-client.
[ec2-user@ip-172-31-0-89 sensu-client]$ sudo su
[root@ip-172-31-0-89 sensu-client]# docker logs sensu-client
/usr/lib/python2.6/site-packages/supervisor-3.1.3-py2.6.egg/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2017-01-09 04:11:47,210 CRIT Supervisor running as root (no user in config file)
2017-01-09 04:11:47,212 INFO supervisord started with pid 12
2017-01-09 04:11:48,214 INFO spawned: 'sensu-client' with pid 15
2017-01-09 04:11:49,524 INFO success: sensu-client entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-01-09 04:11:49,530 INFO exited: sensu-client (exit status 0; expected)
I have a docker image where I wish:
- to run a passenger server and another daemon for monitoring the passenger server.
- the container to exit as soon as either one of these 2 processes exit even once.
- to direct all logs to stdout.
In the config file, I have added an event listener (Reference: https://serverfault.com/questions/760726/how-to-exit-all-supervisor-processes-if-one-exited-with-0-result/762406#762406) that captures some events for the passenger_monit program and executes a script, tt.sh.
I can see one extra instance of the passenger_monit program being spawned and reaching the FATAL state after a few tries. The other passenger_monit and passenger_server are fine, but the other passenger_monit's events do not reach the event listener.
These are the files that are not working as expected:
This is the supervisord.conf
[supervisord]
nodaemon=true
stdout_logfile=/dev/fd/1
redirect_stderr=true
stdout_logfile_maxbytes=0
[unix_http_server]
file=%(here)s/supervisor.sock
[supervisorctl]
serverurl=unix://%(here)s/supervisor.sock
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:passenger_monit]
command=./script/passenger_monit.sh
process_name=passenger_monit
startretries=999
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
autorestart=true
killasgroup=true
stopasgroup=true
numprocs=1
[program:passenger_server]
command=passenger start
startretries=999
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
autorestart=true
killasgroup=true
stopasgroup=true
numprocs=1
[eventlistener:passenger_monit_exit]
command=./tt.sh
process_name=passenger_monit
events=PROCESS_STATE_STARTING,PROCESS_STATE_EXITED,PROCESS_STATE_FATAL
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This is the ./script/passenger_monit.sh
#!/bin/bash
set -x
cd /passenger/newrelic_passenger_plugin/
# if exec is not put, then this process is not killed when supervisord exits
exec ./newrelic_passenger_agent
set +x
This is tt.sh
#!/bin/bash
echo "in tt!"
This is the command I run:
docker exec -it -u deploy 56bbbbe4352b supervisord
This is the output I get:
2016-08-26 19:47:29,369 INFO RPC interface 'supervisor' initialized
2016-08-26 19:47:29,369 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2016-08-26 19:47:29,370 INFO supervisord started with pid 2446
2016-08-26 19:47:30,374 INFO spawned: 'passenger_monit' with pid 2452
2016-08-26 19:47:30,377 INFO spawned: 'passenger_server' with pid 2453
in tt!
2016-08-26 19:47:30,392 INFO exited: passenger_monit (exit status 0; not expected)
=============== Phusion Passenger Standalone web server started ===============
PID file: /home/deploy/abc/tmp/pids/passenger.3000.pid
Log file: /home/deploy/abc/log/passenger.3000.log
Environment: development
Accessible via: http://0.0.0.0:3000/
You can stop Phusion Passenger Standalone by pressing Ctrl-C.
Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/
===============================================================================
2016-08-26 19:47:31,565 INFO spawned: 'passenger_monit' with pid 2494
2016-08-26 19:47:31,566 INFO success: passenger_server entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
in tt!
2016-08-26 19:47:31,571 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:33,576 INFO spawned: 'passenger_monit' with pid 2498
in tt!
2016-08-26 19:47:33,583 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:36,588 INFO spawned: 'passenger_monit' with pid 2499
in tt!
2016-08-26 19:47:36,595 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:37,597 INFO gave up: passenger_monit entered FATAL state, too many start retries too quickly
^C2016-08-26 19:47:47,730 WARN received SIGINT indicating exit request
2016-08-26 19:47:47,735 INFO waiting for passenger_server to die
Stopping web server... done
2016-08-26 19:47:47,839 INFO stopped: passenger_server (exit status 2)
This is the output for supervisorctl status
passenger_monit STOPPED Not started
passenger_monit_exit:passenger_monit FATAL Exited too quickly (process log may have details)
passenger_server RUNNING pid 2453, uptime 0:00:14
Output of supervisord -v
3.0b2
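The following should work. Note that the 10-second script is killed after 5 seconds: when the 5-second script exits, the event listener receives the event and sends SIGQUIT to supervisord, which shuts everything down.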
[supervisord]
loglevel=warn
nodaemon=true
[program:hello]
command=bash -c "echo waiting 5 seconds . . . && sleep 5"
autorestart=false
numprocs=1
startsecs=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
[program:world]
command=bash -c "echo waiting 10 seconds . . . && sleep 10"
autorestart=false
numprocs=1
startsecs=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
[eventlistener:processes]
command=bash -c "echo READY && read line && kill -SIGQUIT $PPID"
events=PROCESS_STATE_STOPPED,PROCESS_STATE_EXITED,PROCESS_STATE_FATAL
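The one-line listener above handles a single event: it announces `READY`, reads the event header line, and then signals its parent. The same handshake can be sketched in Python, which makes the protocol steps explicit. This is a minimal sketch, not a drop-in replacement: the helper names (`parse_header`, `handle_one_event`, `stop_supervisord`, `main`) are hypothetical, the sample event values are made up for the demo, and the payload length is treated as a character count, which is only safe for ASCII payloads.

```python
import io
import os
import signal
import sys

def parse_header(line):
    """Turn 'key:value key:value ...' tokens into a dict."""
    return dict(token.split(":", 1) for token in line.split())

def handle_one_event(stdin, stdout, on_event):
    """Run one READY -> event -> RESULT cycle of the event listener protocol."""
    stdout.write("READY\n")                      # tell supervisord we can accept an event
    stdout.flush()
    header = parse_header(stdin.readline())      # header line carries eventname, len, ...
    payload = stdin.read(int(header["len"]))     # payload follows, 'len' long
    on_event(header, payload)
    stdout.write("RESULT 2\nOK")                 # acknowledge so the next event can be sent
    stdout.flush()
    return header, payload

def stop_supervisord(header, payload):
    """Ask supervisord (the listener's parent process) to shut down."""
    os.kill(os.getppid(), signal.SIGQUIT)

def main():
    # A real listener script would call main(); it is not called in this demo.
    while True:
        handle_one_event(sys.stdin, sys.stdout, stop_supervisord)

# Demo with in-memory streams and a made-up PROCESS_STATE_EXITED event.
payload = "processname:world groupname:world from_state:RUNNING expected:1 pid:42"
header_line = (
    "ver:3.0 server:supervisor serial:1 pool:processes poolserial:1 "
    "eventname:PROCESS_STATE_EXITED len:%d\n" % len(payload)
)
fake_stdin = io.StringIO(header_line + payload)
fake_stdout = io.StringIO()
seen = []
handle_one_event(
    fake_stdin,
    fake_stdout,
    lambda h, p: seen.append((h["eventname"], parse_header(p)["processname"])),
)
print(seen)
```

Because supervisord talks to the listener over stdout, anything else the listener prints there can interfere with the handshake; diagnostic output belongs on stderr.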
Here is how I configure supervisor:
[supervisord]
nodaemon=true
[program:djangoonlyfonts]
command = /code/deploy/gunicorn.sh ; Command to start app
stdout_logfile = /var/log/supervisor/supervisor.log ; Where to write log messages
redirect_stderr = true ; Save stderr in the same log
autostart=true
autorestart=true
gunicorn.sh:
#!/bin/bash
cd /code
export DJANGO_SETTINGS_MODULE=fuentes.settingsser
/usr/local/bin/gunicorn -b 0.0.0.0:8000 --workers=1 fuentes.wsgi:application
I get:
root@3eb7d4cb7a4e:/code# supervisord
/usr/local/lib/python2.7/site-packages/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2016-08-16 07:53:37,712 CRIT Supervisor running as root (no user in config file)
2016-08-16 07:53:37,715 INFO supervisord started with pid 64
2016-08-16 07:53:38,717 INFO spawned: 'djangoonlyfonts' with pid 67
2016-08-16 07:53:38,721 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:39,723 INFO spawned: 'djangoonlyfonts' with pid 68
2016-08-16 07:53:39,728 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:41,732 INFO spawned: 'djangoonlyfonts' with pid 69
2016-08-16 07:53:41,735 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:44,740 INFO spawned: 'djangoonlyfonts' with pid 70
2016-08-16 07:53:44,743 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:45,745 INFO gave up: djangoonlyfonts entered FATAL state, too many start retries too quickly
but when I execute the command directly:
root@3eb7d4cb7a4e:~# /code/deploy/gunicorn.sh
[2016-08-16 07:55:19 +0000] [84] [INFO] Starting gunicorn 19.6.0
[2016-08-16 07:55:19 +0000] [84] [INFO] Listening at: http://0.0.0.0:8000 (84)
[2016-08-16 07:55:19 +0000] [84] [INFO] Using worker: sync
[2016-08-16 07:55:19 +0000] [89] [INFO] Booting worker with pid: 89
The production file is loaded
It just works, which shows that the file is executable and runs correctly.