Exit all supervisord processes if one exits - Docker

I have a Docker image in which I want:
- to run a Passenger server and another daemon that monitors the Passenger server.
- the container to exit as soon as either of these two processes exits, even once.
- all logs directed to stdout.
In the config file, I have added an event listener (reference: https://serverfault.com/questions/760726/how-to-exit-all-supervisor-processes-if-one-exited-with-0-result/762406#762406) that captures some events for the passenger_monit program and executes a script, tt.sh.
I can see one extra instance of the passenger_monit program being spawned and reaching the FATAL state after a few tries. The other passenger_monit and passenger_server are fine. The other passenger_monit's events don't reach the event listener.
These are the files that are not working as expected:
This is the supervisord.conf
[supervisord]
nodaemon=true
stdout_logfile=/dev/fd/1
redirect_stderr=true
stdout_logfile_maxbytes=0
[unix_http_server]
file=%(here)s/supervisor.sock
[supervisorctl]
serverurl=unix://%(here)s/supervisor.sock
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:passenger_monit]
command=./script/passenger_monit.sh
process_name=passenger_monit
startretries=999
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
autorestart=true
killasgroup=true
stopasgroup=true
numprocs=1
[program:passenger_server]
command=passenger start
startretries=999
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
autorestart=true
killasgroup=true
stopasgroup=true
numprocs=1
[eventlistener:passenger_monit_exit]
command=./tt.sh
process_name=passenger_monit
events=PROCESS_STATE_STARTING,PROCESS_STATE_EXITED,PROCESS_STATE_FATAL
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This is the ./script/passenger_monit.sh
#!/bin/bash
set -x
cd /passenger/newrelic_passenger_plugin/
# if exec is not put, then this process is not killed when supervisord exits
exec ./newrelic_passenger_agent
set +x
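The comment about exec in the script above can be checked in isolation. The following is a minimal sketch, not part of the original setup, and the function names are made up for illustration. With exec, the shell is replaced by the program, so the PID that supervisord started is the program itself; without exec, the program runs as a separate child of the shell.

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Prints two PIDs: the shell's own PID, then the PID of the program it
# exec'd. exec replaces the shell in place, so both lines are the same.
pids_with_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Same without exec: the inner program runs as a child of the shell, so the
# second PID differs. The trailing "true" keeps the shell from exec-ing its
# last command as an optimization.
pids_without_exec() {
    sh -c 'echo "$$"; sh -c "echo \$\$"; true'
}

pids_with_exec
pids_without_exec
```

Note that anything placed after an exec line in a script, such as the final set +x above, is never reached when the exec succeeds.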
This is tt.sh
#!/bin/bash
echo "in tt!"
This is the command I run:
docker exec -it -u deploy 56bbbbe4352b supervisord
This is the output I get:
2016-08-26 19:47:29,369 INFO RPC interface 'supervisor' initialized
2016-08-26 19:47:29,369 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2016-08-26 19:47:29,370 INFO supervisord started with pid 2446
2016-08-26 19:47:30,374 INFO spawned: 'passenger_monit' with pid 2452
2016-08-26 19:47:30,377 INFO spawned: 'passenger_server' with pid 2453
in tt!
2016-08-26 19:47:30,392 INFO exited: passenger_monit (exit status 0; not expected)
=============== Phusion Passenger Standalone web server started ===============
PID file: /home/deploy/abc/tmp/pids/passenger.3000.pid
Log file: /home/deploy/abc/log/passenger.3000.log
Environment: development
Accessible via: http://0.0.0.0:3000/
You can stop Phusion Passenger Standalone by pressing Ctrl-C.
Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/
===============================================================================
2016-08-26 19:47:31,565 INFO spawned: 'passenger_monit' with pid 2494
2016-08-26 19:47:31,566 INFO success: passenger_server entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
in tt!
2016-08-26 19:47:31,571 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:33,576 INFO spawned: 'passenger_monit' with pid 2498
in tt!
2016-08-26 19:47:33,583 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:36,588 INFO spawned: 'passenger_monit' with pid 2499
in tt!
2016-08-26 19:47:36,595 INFO exited: passenger_monit (exit status 0; not expected)
2016-08-26 19:47:37,597 INFO gave up: passenger_monit entered FATAL state, too many start retries too quickly
^C2016-08-26 19:47:47,730 WARN received SIGINT indicating exit request
2016-08-26 19:47:47,735 INFO waiting for passenger_server to die
Stopping web server... done
2016-08-26 19:47:47,839 INFO stopped: passenger_server (exit status 2)
This is the output for supervisorctl status
passenger_monit STOPPED Not started
passenger_monit_exit:passenger_monit FATAL Exited too quickly (process log may have details)
passenger_server RUNNING pid 2453, uptime 0:00:14
Output of supervisord -v
3.0b2

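The following should work. Note that the 10-second program is killed after 5 seconds: when the 5-second program exits, the event listener receives the event and sends SIGQUIT to supervisord.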
[supervisord]
loglevel=warn
nodaemon=true
[program:hello]
command=bash -c "echo waiting 5 seconds . . . && sleep 5"
autorestart=false
numprocs=1
startsecs=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
[program:world]
command=bash -c "echo waiting 10 seconds . . . && sleep 10"
autorestart=false
numprocs=1
startsecs=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
[eventlistener:processes]
command=bash -c "echo READY && read line && kill -SIGQUIT $PPID"
events=PROCESS_STATE_STOPPED,PROCESS_STATE_EXITED,PROCESS_STATE_FATAL
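The one-line listener above handles exactly one event. A listener written as a script, in the style of tt.sh, has to follow Supervisor's event listener protocol: print READY, read a header line, read the payload whose size is given by the header's len field, then answer with a RESULT line. A script that only echoes a message and exits never does this, which is consistent with the listener in the question ending up FATAL. The sketch below is an illustration under assumptions: the names header_field, on_event and listen and the run argument are hypothetical, and it asks supervisord to quit on any event it is subscribed to.

```shell
#!/bin/bash
# Sketch of a Supervisor event listener (hypothetical replacement for tt.sh).
# Assumed config line: command=./tt.sh run

# Print the value of a "key:value" token from a header line such as
# "ver:3.0 server:supervisor ... eventname:PROCESS_STATE_EXITED len:84".
header_field() {
    key="$1"
    for token in $2; do        # intentional word splitting on spaces
        case "$token" in
            "$key":*) printf '%s\n' "${token#*:}"; return 0 ;;
        esac
    done
    return 1
}

# Action taken for each event. Diagnostics go to stderr because stdout is
# reserved for the protocol. $PPID is supervisord when it starts this script.
on_event() {
    echo "got $1, asking supervisord (pid $PPID) to quit" >&2
    kill -s QUIT "$PPID"
}

listen() {
    while :; do
        printf 'READY\n'                       # ready for the next event
        IFS= read -r header || break           # header line from supervisord
        len=$(header_field len "$header") || break
        # The payload is exactly $len bytes with no trailing newline.
        payload=$(dd bs=1 count="$len" 2>/dev/null)
        on_event "$(header_field eventname "$header")" "$payload"
        printf 'RESULT 2\nOK'                  # acknowledge the event
    done
}

if [ "${1:-}" = "run" ]; then
    listen
fi
```

Keeping the action in its own function makes it easy to replace the kill with something narrower, for example reacting only when the payload names a particular process.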

Related

How to start rsyslog from supervisor in docker container?

I want to start rsyslog as an additional process in a Docker container because my main service requires it for logging, so I am trying to set it up with Supervisor. But the following fails with a restart loop for rsyslog. Why?
Dockerfile:
FROM debian:buster-slim
RUN set -e \
&& apt-get update \
&& apt-get install --yes \
rsyslog \
supervisor
COPY /services/rsyslog.conf /etc/rsyslog.d/console.conf
CMD ["supervisord", "-c", "/etc/supervisor.conf"]
supervisor.conf:
[supervisord]
#start in foreground
nodaemon=true
[program:syslog]
command=service rsyslog start
#[programm:another]
#command=...
Result:
process | 2022-10-27 10:07:09,906 INFO Set uid to user 0 succeeded
process | 2022-10-27 10:07:09,907 INFO supervisord started with pid 1
process | 2022-10-27 10:07:10,910 INFO spawned: 'syslog' with pid 9
process | 2022-10-27 10:07:10,987 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:11,990 INFO spawned: 'syslog' with pid 22
process | 2022-10-27 10:07:11,999 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:14,003 INFO spawned: 'syslog' with pid 28
process | 2022-10-27 10:07:14,014 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:17,020 INFO spawned: 'syslog' with pid 34
process | 2022-10-27 10:07:17,030 INFO exited: syslog (exit status 0; not expected)
process | 2022-10-27 10:07:17,031 INFO gave up: syslog entered FATAL state, too many start retries too quickly

WDIO docker run: [1643987609.767][SEVERE]: bind() failed: Cannot assign requested address (99)

An error occurs when running a wdio test in Docker via Jenkins, and I have no idea how to solve it.
The same config runs successfully in my local environment (Windows + Docker).
This is the wdio config. I used the default dockerOptions.
wdio.conf
import { config as sharedConfig } from './wdio.shared.conf'
export const config: WebdriverIO.Config = {
...sharedConfig,
...{
host: 'localhost',
services: ['docker'],
dockerLogs: './logs',
dockerOptions: {
image: 'selenium/standalone-chrome:4.1.2-20220131',
healthCheck: {
url: 'http://localhost:4444',
maxRetries: 3,
inspectInterval: 7000,
startDelay: 15000
},
options: {
p: ['4444:4444'],
shmSize: '2g'
}
},
capabilities: [{
acceptInsecureCerts: true,
browserName: 'chrome',
browserVersion: 'latest',
'goog:chromeOptions': {
args: [ '--verbose', '--headless', '--disable-gpu', 'window-size=1920,1800','--no-sandbox', '--disable-dev-shm-usage', '--disable-extensions'],
}
}]
}
}
After that, I try to run the UI test via Jenkins:
19:37:34 Run `npm audit` for details.
19:37:34 + npm run test:ci -- --spec ./test/specs/claim.BNB.spec.ts
19:37:34
19:37:34 > jasmine-boilerplate@1.0.0 test:ci
19:37:34 > wdio run wdio.ci.conf.ts
It fails with an error.
The logs are attached:
wdio.log
2022-02-04T16:59:20.725Z DEBUG @wdio/utils:initialiseServices: initialise service "docker" as NPM package
2022-02-04T16:59:20.758Z INFO @wdio/cli:launcher: Run onPrepare hook
2022-02-04T16:59:20.760Z DEBUG wdio-docker-service: Docker command: docker run --cidfile /home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/selenium_standalone_chrome_latest.cid --rm -p 4444:4444 -p 5900:5900 --shm-size 2g selenium/standalone-chrome:latest
2022-02-04T16:59:20.769Z WARN wdio-docker-service: Connecting dockerEventsListener: 6283
2022-02-04T16:59:20.772Z INFO wdio-docker-service: Cleaning up CID files
2022-02-04T16:59:20.834Z INFO wdio-docker-service: Launching docker image 'selenium/standalone-chrome:latest'
2022-02-04T16:59:20.841Z INFO wdio-docker-service: Docker container is ready
2022-02-04T16:59:20.841Z DEBUG @wdio/cli:utils: Finished to run "onPrepare" hook in 82ms
2022-02-04T16:59:20.842Z INFO @wdio/cli:launcher: Run onWorkerStart hook
2022-02-04T16:59:20.843Z DEBUG @wdio/cli:utils: Finished to run "onWorkerStart" hook in 0ms
2022-02-04T16:59:20.843Z INFO @wdio/local-runner: Start worker 0-0 with arg: run,wdio.ci.conf.ts,--spec,./test/specs/claim.BNB.spec.ts
2022-02-04T16:59:22.034Z DEBUG @wdio/local-runner: Runner 0-0 finished with exit code 1
2022-02-04T16:59:22.035Z INFO @wdio/cli:launcher: Run onComplete hook
2022-02-04T16:59:22.036Z INFO wdio-docker-service: Shutting down running container
2022-02-04T16:59:32.372Z INFO wdio-docker-service: Cleaning up CID files
2022-02-04T16:59:32.373Z INFO wdio-docker-service: Docker container has stopped
2022-02-04T16:59:32.374Z WARN wdio-docker-service: Disconnecting dockerEventsListener: 6283
2022-02-04T16:59:32.374Z DEBUG @wdio/cli:utils: Finished to run "onComplete" hook in 10339ms
2022-02-04T16:59:32.430Z INFO @wdio/local-runner: Shutting down spawned worker
2022-02-04T16:59:32.681Z INFO @wdio/local-runner: Waiting for 0 to shut down gracefully
wdio-0-0.log
2022-02-04T16:59:21.223Z INFO @wdio/local-runner: Run worker command: run
2022-02-04T16:59:21.513Z DEBUG @wdio/config:utils: Found 'ts-node' package, auto-compiling TypeScript files
2022-02-04T16:59:21.714Z DEBUG @wdio/local-runner:utils: init remote session
2022-02-04T16:59:21.717Z DEBUG @wdio/utils:initialiseServices: initialise service "docker" as NPM package
2022-02-04T16:59:21.828Z DEBUG @wdio/local-runner:utils: init remote session
2022-02-04T16:59:21.840Z INFO devtools:puppeteer: Initiate new session using the DevTools protocol
2022-02-04T16:59:21.841Z INFO devtools: Launch Google Chrome with flags: --enable-automation --disable-popup-blocking --disable-extensions --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-sync --metrics-recording-only --disable-default-apps --mute-audio --no-first-run --no-default-browser-check --disable-hang-monitor --disable-prompt-on-repost --disable-client-side-phishing-detection --password-store=basic --use-mock-keychain --disable-component-extensions-with-background-pages --disable-breakpad --disable-dev-shm-usage --disable-ipc-flooding-protection --disable-renderer-backgrounding --force-fieldtrials=*BackgroundTracing/default/ --enable-features=NetworkService,NetworkServiceInProcess --disable-features=site-per-process,TranslateUI,BlinkGenPropertyTrees --window-position=0,0 --window-size=1200,900 --headless --disable-gpu --window-size=1920,1800 --no-sandbox --disable-dev-shm-usage --disable-extensions
2022-02-04T16:59:21.911Z ERROR @wdio/runner: Error:
at new LauncherError (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/src/utils.ts:31:18)
at new ChromePathNotSetError (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/dist/utils.js:33:9)
at Object.linux (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/src/chrome-finder.ts:153:11)
at Function.getFirstInstallation (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/src/chrome-launcher.ts:182:61)
at Launcher.launch (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/src/chrome-launcher.ts:252:37)
at launch (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/chrome-launcher/src/chrome-launcher.ts:74:18)
at launchChrome (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/devtools/build/launcher.js:80:55)
at launch (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/devtools/build/launcher.js:179:16)
at Function.newSession (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/devtools/build/index.js:50:54)
at remote (/home/jenkins/workspace/tests_e2e1_configure_CI_CD/e2e/node_modules/webdriverio/build/index.js:67:43)
wdio-chromedriver.log
Starting ChromeDriver 97.0.4692.71 (adefa7837d02a07a604c1e6eff0b3a09422ab88d-refs/branch-heads/4692@{#1247}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1643987609.767][SEVERE]: bind() failed: Cannot assign requested address (99)
docker-log.txt
2022-02-04 16:59:21,482 INFO Included extra file "/etc/supervisor/conf.d/selenium.conf" during parsing
2022-02-04 16:59:21,484 INFO supervisord started with pid 7
Trapped SIGTERM/SIGINT/x so shutting down supervisord...
2022-02-04 16:59:22,487 INFO spawned: 'xvfb' with pid 9
2022-02-04 16:59:22,489 INFO spawned: 'vnc' with pid 10
2022-02-04 16:59:22,491 INFO spawned: 'novnc' with pid 11
2022-02-04 16:59:22,492 INFO spawned: 'selenium-standalone' with pid 12
2022-02-04 16:59:22,493 WARN received SIGTERM indicating exit request
2022-02-04 16:59:22,493 INFO waiting for xvfb, vnc, novnc, selenium-standalone to die
Setting up SE_NODE_GRID_URL...
2022-02-04 16:59:22,501 INFO success: xvfb entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-02-04 16:59:22,501 INFO success: vnc entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-02-04 16:59:22,501 INFO success: novnc entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
Selenium Grid Standalone configuration:
[network]
relax-checks = true
[node]
session-timeout = "300"
override-max-sessions = false
detect-drivers = false
max-sessions = 1
[[node.driver-configuration]]
display-name = "chrome"
stereotype = '{"browserName": "chrome", "browserVersion": "97.0", "platformName": "Linux"}'
max-sessions = 1
Starting Selenium Grid Standalone...
16:59:22.930 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
16:59:22.939 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
16:59:23.452 INFO [NodeOptions.getSessionFactories] - Detected 4 available processors
16:59:23.493 INFO [NodeOptions.report] - Adding chrome for {"browserVersion": "97.0","browserName": "chrome","platformName": "Linux","se:vncEnabled": true} 1 times
16:59:23.505 INFO [Node.<init>] - Binding additional locator mechanisms: name, id, relative
16:59:23.526 INFO [LocalDistributor.add] - Added node 150c2c05-2b08-4ba9-929a-45fef66bb193 at http://172.17.0.2:4444. Health check every 120s
16:59:23.540 INFO [GridModel.setAvailability] - Switching node 150c2c05-2b08-4ba9-929a-45fef66bb193 (uri: http://172.17.0.2:4444) from DOWN to UP
16:59:23.645 INFO [Standalone.execute] - Started Selenium Standalone 4.1.2 (revision 9a5a329c5a): http://172.17.0.2:4444
2022-02-04 16:59:26,091 INFO waiting for xvfb, vnc, novnc, selenium-standalone to die
2022-02-04 16:59:29,095 INFO waiting for xvfb, vnc, novnc, selenium-standalone to die
2022-02-04 16:59:32,097 INFO waiting for xvfb, vnc, novnc, selenium-standalone to die

Docker with Supervisor

I have created a Dockerfile that uses Supervisor.
I have added two processes to the Supervisord config file:
The first process runs httpd or Tomcat.
The second process calls a shell script. The script uses echo and read to accept user input and insert it into a property file.
The intention is to run the first process in the background and have the second process wait for user input.
When I run the Docker image, the second process executes but does not wait for input. Why?
2021-02-09 16:46:32,901 CRIT Supervisor running as root (no user in config file)
2021-02-09 16:46:32,901 WARN No file matches via include "/etc/supervisord/*.conf"
2021-02-09 16:46:32,903 INFO supervisord started with pid 1
2021-02-09 16:46:33,908 INFO spawned: 'supervisor_stdout' with pid 10
2021-02-09 16:46:33,911 INFO spawned: 'UserInput' with pid 11
2021-02-09 16:46:33,914 DEBG 'UserInput' stdout output:
BMC_DATABASE_HOST:
2021-02-09 16:46:33,939 DEBG 'supervisor_stdout' stdout output:
READY
2021-02-09 16:46:33,940 DEBG supervisor_stdout: ACKNOWLEDGED -> READY
2021-02-09 16:46:34,942 INFO success: supervisor_stdout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-02-09 16:46:34,942 INFO success: UserInput entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

Supervisor in Docker doesn't work

I have a problem with Supervisor in Docker. I use Supervisor to start four .sh scripts: datagrid.sh, ml.sh, startmap.sh and dirwatcher.sh.
When I open a shell in the container, navigate to the scripts directory and start the scripts manually, everything works and all the scripts start, but they don't start when the container starts. I assume the problem is with Supervisor. Thank you.
The error:
2018-08-08 12:28:08,512 INFO spawned: 'datagrid' with pid 171
2018-08-08 12:28:08,514 INFO spawned: 'dirwatcher' with pid 172
2018-08-08 12:28:08,517 INFO spawned: 'startmap' with pid 173
2018-08-08 12:28:08,519 INFO spawned: 'ml' with pid 175
2018-08-08 12:28:08,520 INFO exited: datagrid (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: dirwatcher (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: startmap (exit status 0; not expected)
2018-08-08 12:28:08,520 INFO exited: ml (exit status 0; not expected)
2018-08-08 12:28:08,527 INFO gave up: datagrid entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,532 INFO gave up: ml entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,537 INFO gave up: startmap entered FATAL state, too many start retries too quickly
2018-08-08 12:28:08,539 INFO gave up: dirwatcher entered FATAL state, too many start retries too quickly
My supervisord.conf file:
[supervisord]
nodaemon=false
[program:datagrid]
command=sh /EscomledML/MLScripts/escomled_data_grid.sh start -D
[program:dirwatcher]
command=sh /EscomledML/MLScripts/escomled_dirwatcher.sh start -D
[program:startmap]
command=sh /EscomledML/MLScripts/escomled_startmap.sh start -D
[program:ml]
command=sh /EscomledML/MLScripts/escomled_ml.sh start -D
I use alpine linux in the container.
There are a few problems here.
The following statement:
[supervisord]
nodaemon=false
This makes Supervisord run as a daemon, but the container needs a main process that stays in the foreground.
Try changing it to:
[supervisord]
nodaemon=true
This configuration makes Supervisord itself run as a foreground process, which will keep the container up and running.
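From the logs:
'520 INFO exited: datagrid (exit status 0; not expected)'
Supervisord logs this exit as unexpected, most likely because each script exited almost immediately, before startsecs (1 second by default) had elapsed, so it counts as a failed start rather than a clean exit. Add the following to the conf for each process. This tells Supervisord to restart the process only if the exit code is not 0. If the scripts are meant to finish right away, setting startsecs=0 may also be needed.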
[program:datagrid]
command=sh /EscomledML/MLScripts/escomled_data_grid.sh start -D
autorestart=unexpected
exitcodes=0

Supervisor "exit status 127; not expected" on Docker

Here is how I configure supervisor:
[supervisord]
nodaemon=true
[program:djangoonlyfonts]
command = /code/deploy/gunicorn.sh ; Command to start app
stdout_logfile = /var/log/supervisor/supervisor.log ; Where to write log messages
redirect_stderr = true ; Save stderr in the same log
autostart=true
autorestart=true
gunicorn.sh:
#!/bin/bash
cd /code
export DJANGO_SETTINGS_MODULE=fuentes.settingsser
/usr/local/bin/gunicorn -b 0.0.0.0:8000 --workers=1 fuentes.wsgi:application
I get:
root@3eb7d4cb7a4e:/code# supervisord
/usr/local/lib/python2.7/site-packages/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2016-08-16 07:53:37,712 CRIT Supervisor running as root (no user in config file)
2016-08-16 07:53:37,715 INFO supervisord started with pid 64
2016-08-16 07:53:38,717 INFO spawned: 'djangoonlyfonts' with pid 67
2016-08-16 07:53:38,721 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:39,723 INFO spawned: 'djangoonlyfonts' with pid 68
2016-08-16 07:53:39,728 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:41,732 INFO spawned: 'djangoonlyfonts' with pid 69
2016-08-16 07:53:41,735 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:44,740 INFO spawned: 'djangoonlyfonts' with pid 70
2016-08-16 07:53:44,743 INFO exited: djangoonlyfonts (exit status 127; not expected)
2016-08-16 07:53:45,745 INFO gave up: djangoonlyfonts entered FATAL state, too many start retries too quickly
but when I execute the command directly:
root@3eb7d4cb7a4e:~# /code/deploy/gunicorn.sh
[2016-08-16 07:55:19 +0000] [84] [INFO] Starting gunicorn 19.6.0
[2016-08-16 07:55:19 +0000] [84] [INFO] Listening at: http://0.0.0.0:8000 (84)
[2016-08-16 07:55:19 +0000] [84] [INFO] Using worker: sync
[2016-08-16 07:55:19 +0000] [89] [INFO] Booting worker with pid: 89
The production file is loaded
It just works, which shows that the file is executable and runs correctly.
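One way to narrow this down: in a shell, exit status 127 conventionally means the command could not be found, and 126 means it was found but could not be executed. The sketch below is only a diagnostic aid built on that convention. The helper name status_of is made up, and it does not reproduce the environment Supervisor gives its child processes, which may differ from an interactive shell.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of the original setup.

# Run a command line in a fresh shell, discard its output, and print only
# its exit status.
status_of() {
    sh -c "$1" >/dev/null 2>&1
    echo "$?"
}

status_of 'true'                          # a command that succeeds
status_of 'no-such-command-for-demo'      # a name that is not on PATH
status_of '/no/such/dir/gunicorn.sh'      # a path that does not exist
```

The first call prints 0 and the second prints 127. Running status_of with the exact command string from the Supervisor config is a quick check of whether the path or interpreter resolves in a non-interactive shell.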
