How to capture the dask-worker console log in a file?

from dask.distributed import Client

def my_task():
    print("dask_worker_log_msg")  # runs in the worker process
    ...

client = Client()
future = client.submit(my_task)
print("dask_client_log_msg")  # runs in the client process
...
I want to capture "dask_client_log_msg" and the other client-side messages in one file, and "dask_worker_log_msg" and the other worker-side messages in a separate file. Since the client runs in a different process than the worker, each process should write its messages to its own log file. Thanks!

You can get logs from your workers with the Client.get_worker_logs method. You can also download logs from the dashboard in the info pane.
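For example, a minimal sketch, assuming a local cluster (get_worker_logs returns a dict mapping each worker's address to its recent log records):

from dask.distributed import Client

client = Client()  # a local cluster, for illustration
logs = client.get_worker_logs()  # dict: worker address -> recent log records
for worker_addr, records in logs.items():
    print(worker_addr, records)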

Here's a solution if you're trying to run a Dask cluster and need the logs from all the jobs it runs (including output from your scripts via print or logger.info):
Add a redirect in the bash script that starts the worker:
dask-worker >> dask_worker.log 2>&1
In your script, log through dask's own distributed.worker logger, like so:
logger = logging.getLogger("distributed.worker")
Configure the log format in ~/.config/dask/distributed.yaml
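Putting the steps together, a minimal sketch of the worker-side logging (the logger name below is the one dask's workers already use, so with the stdout/stderr redirect from the first step these messages land in dask_worker.log):

import logging

# use the logger that the dask worker process already configures
logger = logging.getLogger("distributed.worker")

def my_task():
    logger.info("dask_worker_log_msg")  # ends up in dask_worker.log via the redirect
    ...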
See also: How to capture logs from workers from a Dask-Yarn job?

Related

Make one instance of multiple uWSGI workers perform an extra function

I have a Python Flask app running on uWSGI with a config file that tells it to spawn multiple workers (which I am assuming are identical processes).
Everything works well except for one part: the app uses a scheduler to run a bash command once a day that downloads and updates a database. The command needs to run only once, but multiple processes mean it runs multiple times concurrently, corrupting the downloaded file.
Is there a way to run this bash command in only one of the uWSGI workers? I can't run it as a separate cron job (the database update has to integrate seamlessly with the app).
Check the uWSGI cron-like interface:
uWSGI's master has an internal cron-like facility that can generate events at predefined times. You can use it; for example, set the options to:
[uwsgi]
; fields: minute hour day month weekday; -1 means "any", -2 "every 2nd"
; i.e. run at minute 0 of every second hour
cron = 0 -2 -1 -1 -1 /usr/bin/backup_my_home --recursive
Is that sufficient?
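If the app itself is in Python, the same cron facility is also reachable from application code via the uwsgidecorators module. A minimal sketch, assuming the app runs under uWSGI's Python plugin (the function name and the command path are placeholders taken from the example above):

import subprocess

from uwsgidecorators import cron

# the master triggers this once per scheduled time, not once per worker
@cron(0, -2, -1, -1, -1)  # minute 0 of every second hour; -1 means "any"
def update_database(signum):
    subprocess.run(["/usr/bin/backup_my_home", "--recursive"], check=True)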

Stopping dask-ssh created scheduler from the Client interface

I am running Dask on a SLURM-managed cluster.
dask-ssh --nprocs 2 --nthreads 1 --scheduler-port 8786 --log-directory `pwd` --hostfile hostfile.$JOBID &
sleep 10
# We need to tell dask Client (inside python) where the scheduler is running
scheduler="`hostname`:8786"
echo "Scheduler is running at ${scheduler}"
export ARL_DASK_SCHEDULER=${scheduler}
echo "About to execute $CMD"
eval $CMD
# Wait for dask-ssh to be shut down from the Python code
wait %1
I create a Client inside my Python code and, when finished, shut it down:
c = Client(scheduler_id)
...
c.shutdown()
My reading of the dask-ssh help is that the shutdown will shut down all workers and then the scheduler. But it does not stop the background dask-ssh process, so eventually the job times out.
I've tried this interactively in the shell. I cannot see how to stop the scheduler.
I would appreciate any help.
Thanks,
Tim
Recommendation with --scheduler-file
First, when setting up with SLURM you might consider using the --scheduler-file option, which allows you to coordinate the scheduler address using your NFS (which I assume you have, given that you're using SLURM). I recommend reading this doc section: http://distributed.readthedocs.io/en/latest/setup.html#using-a-shared-network-file-system-and-a-job-scheduler
dask-scheduler --scheduler-file /path/to/scheduler.json
dask-worker --scheduler-file /path/to/scheduler.json
dask-worker --scheduler-file /path/to/scheduler.json
>>> client = Client(scheduler_file='/path/to/scheduler.json')
Given this, it also becomes easier to use the sbatch or qsub command directly. Here is an example with SGE's qsub:
# Start a dask-scheduler somewhere and write connection information to file
qsub -b y /path/to/dask-scheduler --scheduler-file /path/to/scheduler.json
# Start 100 dask-worker processes in an array job pointing to the same file
qsub -b y -t 1-100 /path/to/dask-worker --scheduler-file /path/to/scheduler.json
Client.shutdown
It looks like client.shutdown only shuts down the client. You're correct that this is inconsistent with the docstring. I've raised an issue here: https://github.com/dask/distributed/issues/1085 for tracking further developments.
In the meantime
These three commands should suffice to tear down the workers, close the scheduler, and stop the scheduler process:
# ask the scheduler to retire and close all workers
client.loop.add_callback(client.scheduler.retire_workers, close_workers=True)
# ask the scheduler to terminate itself
client.loop.add_callback(client.scheduler.terminate)
# stop the scheduler's event loop so its process can exit
client.run_on_scheduler(lambda dask_scheduler: dask_scheduler.loop.stop())
What people usually do
Typically people start and stop clusters with whatever means they used to start them. With SLURM this might involve cancelling the job (e.g. with scancel). Regardless, we should make the client-focused way more consistent.

Docker - Handling multiple services in a single container

I would like to start two different services in my Docker container and exit the container as soon as one of them exits. I looked at supervisor, but I can't find out how to get it to quit as soon as one of the managed applications exits; it tries to restart them up to three times (the default setting) and then just sits there doing nothing. Is supervisor able to do this, or is there another tool for it? A bonus would be if there were also a way to let both managed programs write to stdout, tagged with their application name, e.g.:
[Program 1] Some output
[Program 2] Some other output
[Program 1] Output again
Since you asked if there was another tool: we designed and wrote a powerful replacement for supervisord built specifically for Docker. It automatically terminates when all applications quit, has special service settings to control this behavior, and redirects stdout with tagged, syslog-compatible output lines. It's open source and being used in production.
Here is a quick start for Docker: http://garywiz.github.io/chaperone/guide/chap-docker-simple.html
There is also a complete set of tested base images, which are a good example, at https://github.com/garywiz/chaperone-docker, but these might be overkill and the earlier quick start may do the trick.
I found solutions to both of my requirements by reading through the docs some more.
Exit supervisord on application exit
This can be achieved by using a custom eventlistener. I had to add the following segment into my supervisord configuration file:
[eventlistener:shutdownevent]
command=/shutdownhandler.sh
events=PROCESS_STATE_EXITED
supervisord will start the referenced script, and upon the given event being triggered (PROCESS_STATE_EXITED fires after one of the managed programs exits without being restarted automatically), it will send a line containing data about the event on the script's stdin.
The referenced shutdownhandler-script contains:
#!/bin/bash
while :
do
    echo -en "READY\n"            # signal readiness to accept an event
    read line                     # blocks until supervisord sends an event line
    kill $(cat /supervisord.pid)  # terminate supervisord itself
    echo -en "RESULT 2\nOK"       # acknowledge the event
done
The script has to indicate being ready by sending "READY\n" on its stdout, after which it may receive an event data line on its stdin. For my use case, upon receipt of a line (meaning one of the managed programs has exited), a SIGTERM is sent to the supervisord process, which is found via the pid file it leaves in the root directory by default. For technical completeness, I also included a positive answer for the event listener, though that one should never matter.
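For comparison, a hypothetical Python version of the same event listener, speaking the supervisor eventlistener protocol on stdin/stdout (the /supervisord.pid path is the same assumption as in the shell script above):

#!/usr/bin/env python3
import os
import signal
import sys

while True:
    sys.stdout.write("READY\n")    # tell supervisord we can accept an event
    sys.stdout.flush()
    header = sys.stdin.readline()  # e.g. "ver:3.0 ... eventname:PROCESS_STATE_EXITED len:54"
    if not header:
        break
    fields = dict(token.split(":", 1) for token in header.split())
    sys.stdin.read(int(fields["len"]))  # consume (and ignore) the event payload
    # one of the managed programs has exited: terminate supervisord itself
    with open("/supervisord.pid") as f:
        os.kill(int(f.read().strip()), signal.SIGTERM)
    sys.stdout.write("RESULT 2\nOK")   # acknowledge the event
    sys.stdout.flush()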
Tagged output on stdout
I did this by simply starting a tail process in the background before starting supervisord, tailing each program's output log and piping the lines through ts (from the moreutils package) to prepend a tag. This way it shows up via docker logs with an easy way to see which program actually wrote the line.
tail -fn0 /var/log/supervisor/program1.log | ts '[Program 1]' &

supervisord: is it possible to redirect subprocess stdout back to supervisord?

I'm using supervisord as the entry point for Docker containers, as described in https://docs.docker.com/articles/using_supervisord/.
I want all logs to be written to stdout so I can take advantage of built-in tools like docker logs or systemd's journal, especially when running the containers on CoreOS.
For stderr there's the redirect_stderr=true option for subprocesses. Is it possible to redirect the subprocess stdout back to supervisord somehow and not deal with actual log files?
You can redirect the program's stdout to supervisor's stdout using the following configuration options:
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
Explanation:
When a process opens /dev/fd/1 (which is the same as /proc/self/fd/1), the system actually clones file descriptor #1 (stdout) of that process. Using this as stdout_logfile therefore causes supervisord to redirect the program's stdout to its own stdout.
stdout_logfile_maxbytes=0 disables log file rotation which is obviously not meaningful for stdout. Not specifying this option will result in an error because the default value is 50MB and supervisor is not smart enough to detect that the specified log file is not a regular file.
For more information:
http://veithen.github.io/2015/01/08/supervisord-redirecting-stdout.html

Avoid generating empty STDOUT and STDERR files with Sun Grid Engine (SGE) and array jobs

I am running array jobs with Sun Grid Engine (SGE).
My carefully scripted array job workers generate no stdout and no stderr when they function properly. Unfortunately, SGE insists on creating an empty stdout and stderr file for each run.
Sun's manual states:
STDOUT and STDERR of array job tasks will be written into different files with the default location
<jobname>.['e'|'o']<job_id>'.'<task_id>
In order to change this default, the -e and -o options (see above) can be used together with the pseudo environment variables $HOME, $USER, $JOB_ID, $JOB_NAME, $HOSTNAME, and $SGE_TASK_ID.
Note that you can use output redirection to divert the output of all tasks into the same file, but the result of this is undefined.
I would like to have the output files suppressed if they are empty. Is there any way to do this?
No, there is no way to do this.
