I have a docker-compose.yml file that defines a number of services. One is a redis instance, and another is a queue-worker.
The queue-worker fetches jobs from redis and performs the necessary work.
Currently, I have the queue-worker's stop_grace_period set to 5m within my docker-compose.yml. The idea is that when I run docker-compose down, the queue-worker will have 5 minutes to deal with any remaining jobs in the queue before shutting down.
I would like to improve the situation, if possible, by performing a check when docker-compose down is called. If the result of the check is true, e.g. curl http://project/total-jobs-in-queue returns 0, then go ahead and stop the queue-worker container immediately. If the result is false, i.e. there are still jobs in the work queue, delay container shutdown until the check passes.
I could write a bash script to perform this check and stop the containers individually from within it; however, if this could be configured in such a way that the standard docker-compose up/down commands could continue to be used, that would be much preferable.
Is this possible?
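For illustration, one way to get this behaviour while keeping plain docker-compose up/down is to wrap the worker in a signal-trapping entrypoint. This is only a sketch; the worker command and the /total-jobs-in-queue endpoint below are the placeholders from the question:
#!/bin/sh
# Hypothetical entrypoint for the queue-worker service. On SIGTERM (sent by
# docker-compose down) it waits for the queue to drain before stopping the
# worker; stop_grace_period still acts as the hard upper bound before SIGKILL.
queue-worker &                      # placeholder for the real worker command
WORKER_PID=$!
drain_and_exit() {
    while [ "$(curl -fsS http://project/total-jobs-in-queue)" != "0" ]; do
        sleep 5                     # poll until the queue reports zero jobs
    done
    kill -TERM "$WORKER_PID"
    wait "$WORKER_PID"
    exit 0
}
trap drain_and_exit TERM INT
wait "$WORKER_PID"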
Related
I have looked for a bit on Stack Overflow for a way to have a container start up and wait for an external connection but have not seen anything.
Here is what my process looks like currently:
Non-Docker external process reaches out at X interval and tells system to run a command.
Command runs.
System should remain idle until the next interval.
Now I have seen a few options with --wait or sleep but I would think that would not allow the container to receive the connection.
I also looked at the wait-for-container script that is often recommended, but in this case I need the container to wait for a script that calls it at non-defined intervals.
I have tried having the container just run the help command for my process, but after a bit of time the container fails, and that makes it a mess to find anything.
I have also tried having the container start with no command, just running the base OS and waiting for the call, but that did not work either.
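For what it is worth, the usual pattern for that last attempt is to give the container a long-running no-op as its main process and then let the external process run the command on demand with docker exec. A sketch only, with hypothetical image and command names:
# Keep PID 1 alive with a command that never exits.
docker run -d --name runner my-runner-image tail -f /dev/null
# Later, whenever the external process reaches out at its interval:
docker exec runner /path/to/my-command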
I was looking at this wrong.
Ended up just running it like any other web server and database server.
I am trying to stop the Docker container automatically after 1 hour. I mean, if there is no process going on or the container is idle for 1 hour, then stop that container. Is it possible to do this programmatically within the Dockerfile? Any thoughts would be helpful.
Thanks in advance.
The closest thing the Dockerfile itself supports would be the HEALTHCHECK directive, e.g. HEALTHCHECK [OPTIONS] CMD command. Here you can specify an interval (e.g. 1 hour) and a timeout; the available options are listed below, with a sketch after them.
--interval=DURATION (default: 30s)
--timeout=DURATION (default: 30s)
--start-period=DURATION (default: 0s)
--retries=N (default: 3)
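A sketch of what that could look like in a Dockerfile (the process name is a placeholder, and pgrep must exist in the image). Note that a failing healthcheck only marks the container unhealthy; with plain docker run, something external, or an orchestrator, still has to act on that status:
HEALTHCHECK --interval=1h --timeout=30s --retries=1 \
  CMD pgrep -f my-main-process || exit 1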
Other than that, you would have to create a custom shell script that is triggered by a cron job every hour. In this script you would stop the foreground process, and by that stop the running container.
As far as I know such a scenario is not part of the docker workflow.
The container is alive as long as its main process is alive. When that process (PID 1) exits (with error or success), the container also stops.
So the only way I see is to either build this logic inside your program (the main process in the container) or wrap the program in a shell script that kills the process based on some rule (like no log entries for a certain amount of time).
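A rough sketch of that wrapper idea, assuming the program is /usr/local/bin/my-app and that an idle container writes nothing to /var/log/my-app.log (both names are placeholders):
#!/bin/sh
# Run the real program, then stop it once the log has been quiet for an hour.
/usr/local/bin/my-app &
APP_PID=$!
while kill -0 "$APP_PID" 2>/dev/null; do
    sleep 60
    # find prints the file only if it was modified in the last 60 minutes;
    # an empty result therefore means one hour of inactivity.
    if [ -z "$(find /var/log/my-app.log -mmin -60 2>/dev/null)" ]; then
        kill -TERM "$APP_PID"
        break
    fi
done
wait "$APP_PID"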
I have a Docker image that needs to be run in an environment where I have no admin privileges, using Slurm 17.11.8 in RHEL. I am using udocker to run the container.
In this container, there are two applications that need to run:
[1] ROS simulation (there is a rosnode that is a TCP client talking to [2])
[2] An executable (TCP server)
So [1] and [2] need to run together, and they share some common files as well. Usually, I run them in separate terminals. But I have no idea how to do this with Slurm.
Possible Solution:
(A) Use two containers of the same image, but then their files will be stored locally. I could use volumes instead, but this requires me to change my code significantly and maybe break compatibility when I am not running it as containers (e.g. in Eclipse).
(B) Use a bash script to launch two terminals and run [1] and [2]. Then srun this script.
I am looking at (B) but have no idea how to approach it. I looked into other approaches, but they address sequential execution of multiple processes; I need these to run concurrently.
If it helps, I am using xfce-terminal though I can switch to other terminals such as Gnome, Konsole.
This is a shot in the dark since I don't work with udocker.
In your Slurm submit script, to be submitted with sbatch, you could allocate enough resources for both jobs to run on the same node (so you just need to reference localhost for your client/server). Start your first process in the background with something like:
udocker container_name container_args &
The & should start the first container in the background.
You would then start the second container:
udocker 2nd_container_name more_args
This one runs without & to keep the process in the foreground. Ideally, when the second container completes, the script completes and Slurm cleanup kills the first container. If both containers come to an end cleanly, you can put a wait at the end of the script.
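Put together, the submit script could look roughly like this (resource values, container names and arguments are all placeholders, and the udocker invocation may need adjusting):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --time=02:00:00
# Start the TCP server container in the background and remember its PID.
udocker run server_container server_args &
SERVER_PID=$!
# Give the server a moment to start listening before the client connects.
sleep 10
# Run the ROS simulation (TCP client) in the foreground; the job ends with it.
udocker run client_container client_args
# Clean up the background server rather than relying on Slurm to do it.
kill "$SERVER_PID" 2>/dev/null
wait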
Caveats:
Depending on how Slurm is configured, processes may not be properly cleaned up at the end. You may need to capture the PID of the first udocker as a variable and kill it before you exit.
The first container may still be processing when the second completes. You may need to add a sleep command at the end of your submission script to give it time to finish.
Any number of other gotchas may exist that you will need to find and hopefully work around.
Problem domain
Imagine that a stateful container is being managed by Swarm, e.g. a database, and another container relies on it, e.g. a service that is executing a long-running job (minutes, sometimes hours) and cannot tolerate the database (or even itself) going down while it is executing.
To give an example: a database importing a multi-GB dump.
There's also a CI/CD system in place which takes care of building new versions of the containers and deploying them to the Swarm, or pushing the image to Docker Hub which then calls a defined webhook which fires off the deployment event.
Question
Is there any way I can build my containers so that Swarm can know whether it's OK to update them or not? Similarly to how HEALTHCHECK reports whether a container needs to be restarted, something that would let Swarm know that 'it's safe to restart this container now'.
Or is it the CI/CD system's responsibility to check whether the stateful containers are safe to restart, and only then issue the update command to swarm?
Thanks in advance!
Docker will not check with a container if it is ready to be stopped, once you give docker the command to stop a container it will perform that action. However it performs the stop in two steps. The first step is a SIGTERM that your container can trap and gracefully handle. By default, after 10 seconds, a SIGKILL is sent that the Linux kernel immediately applies and cannot be trapped by the container. For your goals, you'll want to make sure your app knows when it's safe to exit after receiving the first signal, and you'll probably want to extend the time to much longer than 10 seconds between signals.
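To extend that window, the timeout can be raised per stop command or per service (values and names below are placeholders):
# -t extends the grace period between SIGTERM and SIGKILL to 600 seconds (default 10).
docker stop -t 600 my_container
# The per-service equivalents are stop_grace_period: 10m in a compose file,
# or: docker service update --stop-grace-period 10m my_service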
The healthcheck won't tell docker that your container is at a safe point to stop. It does tell swarm when your container has finished starting, or when it's misbehaving and needs to be stopped and replaced. The healthcheck defines a command to run inside your container, and the exit code is checked for whether it's 0 (healthy) or 1 (unhealthy). No other exit codes are currently valid.
If you need more than the simple signal handling inside the container, then yes, you're likely moving up the stack to a ci/cd tool to manage the deployment.
I'm setting up a database container with a script, rc.db, which provides standard init commands like:
/etc/rc.db start
/etc/rc.db stop
/etc/rc.db status
From Is it possible to install a complex server inside a Docker container?, I know I could use a simple script to start the db container (for example, named /etc/db_run.sh):
#!/bin/sh
/etc/rc.db start
wait
And the Dockerfile
...
CMD /etc/db_run.sh
Because closing the database correctly is important, I wish that when the container is stopped, it would call /etc/rc.db stop.
When Docker tries to stop a container, it sends a SIGTERM signal, followed by a SIGKILL after a grace period. Just catch this signal and either call your script or pass it on to the DB process, whichever is appropriate.
I suspect that if you make the DB the main process running in the foreground, it will handle the signals correctly itself.
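A minimal sketch of such an entrypoint, assuming /etc/rc.db backgrounds the database itself (so something has to keep PID 1 alive):
#!/bin/sh
# Start the database, then shut it down cleanly when Docker sends SIGTERM.
/etc/rc.db start
shutdown_db() {
    /etc/rc.db stop
    exit 0
}
trap shutdown_db TERM INT
# Keep the entrypoint (PID 1) alive until a signal arrives.
while :; do
    sleep 1
done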