I have a Docker container running a 10-hour Luigi task. I want to pause the container so I can use my laptop for something else. I tried "docker pause", but when I unpause, the Luigi scheduler shows no tasks running, so I have to start again.
Is there any way I can pause and restart exactly where I left off? I suspect it may be luigi that is deleting the task.
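One way to limit the damage from a restart, sketched here with the standard library only rather than with Luigi itself: Luigi treats a task as complete once its output target exists, so a long job that is split into steps with one output each can be rerun and will skip the finished steps. The names run_steps, work_dir and do_step are hypothetical, and the sketch assumes work_dir lives somewhere that survives the container (for example a mounted volume).

```python
import tempfile
from pathlib import Path

def run_steps(step_names, work_dir, do_step):
    """Run each step once; skip steps whose marker file already exists."""
    executed = []
    for name in step_names:
        marker = Path(work_dir) / f"{name}.done"
        if marker.exists():
            continue  # this step finished in an earlier run
        do_step(name)
        marker.write_text("ok")  # written only after the step succeeds
        executed.append(name)
    return executed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        steps = ["extract", "transform", "load"]
        print(run_steps(steps, work_dir, lambda name: None))  # all three run
        print(run_steps(steps, work_dir, lambda name: None))  # nothing left to run
```

This does not resume a step in the middle; it only avoids repeating steps that already produced their marker.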
I am trying to understand the difference between the commands docker stop ContainerID and docker pause ContainerID. According to this page both of them are used to pause an existing Docker container.
The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended.
https://docs.docker.com/engine/reference/commandline/pause/
The docker stop command: the main process inside the container will receive SIGTERM, and after a grace period, SIGKILL.
https://docs.docker.com/engine/reference/commandline/stop/#options
SIGTERM is the termination signal. The default behavior is to terminate the process, but it also can be caught or ignored. The intention is to kill the process, gracefully or not, but to first allow it a chance to cleanup.
SIGKILL is the kill signal. The only behavior is to kill the process, immediately. As the process cannot catch the signal, it cannot cleanup, and thus this is a signal of last resort.
SIGSTOP is the pause signal. The only behavior is to pause the process; the signal cannot be caught or ignored. The shell uses pausing (and its counterpart, resuming via SIGCONT) to implement job control.
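The catch-or-not distinction between these three signals can be checked from Python on Linux. This is a minimal sketch, not part of Docker; can_install_handler is a hypothetical helper name, and it has to run in the main thread because that is where Python allows signal handlers to be installed.

```python
import signal

def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum."""
    previous = signal.getsignal(signum)
    try:
        signal.signal(signum, lambda received, frame: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    # Put the earlier disposition back so the probe has no side effects.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True

for name in ("SIGTERM", "SIGKILL", "SIGSTOP"):
    print(name, "catchable:", can_install_handler(getattr(signal, name)))
```

Only SIGTERM accepts a handler, which is why a process can clean up on docker stop during the grace period but gets no say once SIGKILL arrives.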
In addition to the earlier answers:
Running docker events after docker stop shows these events:
kill (signal 15): where signal 15 = SIGTERM
die
stop
Running docker events after docker pause shows only one event:
pause
Also, docker pause keeps the container's memory allocated while the container is paused, and that memory is used again when the container is resumed. docker stop releases the memory once the container has stopped.
This table has even more details.
What is the difference between docker stop and docker pause?
docker stop: sends SIGTERM (the termination signal) and, if needed, SIGKILL (the kill signal).
docker pause: suspends the processes, which is the effect of SIGSTOP (the pause signal); on Linux, Docker does this through the cgroups freezer rather than by sending the signal.
SIGTERM: The default behavior is to terminate the process, but it can also be caught or ignored. The intention is to kill the process, gracefully or not, but first give it a chance to clean up.
SIGKILL: The only behavior is to kill the process, immediately. As the process cannot catch the signal, it cannot clean up; thus, this is a signal of last resort.
SIGSTOP: The only behavior is to pause the process; the signal cannot be caught or ignored. The shell uses pausing (and its counterpart, resuming via SIGCONT) to implement job control.
When to use docker stop and docker pause?
docker stop: When you want to free memory and discard the processes' in-memory (cached) data. Simply put, you no longer care about the processes in the container and are comfortable with killing them.
docker pause: When you only want to suspend the processes in the container; you do not want them to lose data or state.
Example:
Consider a container with a counter. Assume the counter has reached 3000. Running docker stop will cause the counter to lose its value and you will be unable to retrieve it. Using docker pause, on the other hand, will maintain the counter state and value.
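The counter example can be reproduced without Docker, using a plain child process as a stand-in for the container. This is only an analogy: SIGSTOP/SIGCONT play the role of docker pause/unpause (which on Linux goes through the cgroups freezer instead), and terminate-then-start-again plays the role of docker stop/start. The function names are hypothetical.

```python
import signal
import subprocess
import sys
import time

# Child program: an in-memory counter that prints each new value.
COUNTER_SOURCE = (
    "import time\n"
    "n = 0\n"
    "while True:\n"
    "    n += 1\n"
    "    print(n, flush=True)\n"
    "    time.sleep(0.01)\n"
)

def start_counter():
    return subprocess.Popen(
        [sys.executable, "-c", COUNTER_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    )

def read_values(proc, count):
    return [int(proc.stdout.readline()) for _ in range(count)]

def shut_down(proc, sig):
    proc.send_signal(sig)
    proc.wait()
    proc.stdout.close()

def pause_and_resume():
    proc = start_counter()
    try:
        before = read_values(proc, 3)
        proc.send_signal(signal.SIGSTOP)  # suspend; memory stays intact
        time.sleep(0.2)
        proc.send_signal(signal.SIGCONT)  # resume where it left off
        after = read_values(proc, 3)
    finally:
        shut_down(proc, signal.SIGKILL)
    return before + after

def stop_and_restart():
    first = start_counter()
    before = read_values(first, 3)
    shut_down(first, signal.SIGTERM)      # process ends; counter is gone
    second = start_counter()              # a fresh process starts from zero
    try:
        after = read_values(second, 3)
    finally:
        shut_down(second, signal.SIGKILL)
    return before + after

if __name__ == "__main__":
    print(pause_and_resume())   # [1, 2, 3, 4, 5, 6]
    print(stop_and_restart())   # [1, 2, 3, 1, 2, 3]
```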
Hope it's clear now!
docker pause suspends all the processes in the container(s); the effect is that of SIGSTOP, although on Linux Docker uses the cgroups freezer.
docker stop sends SIGTERM, and if needed SIGKILL, to the main process of the container(s).
When docker pause is issued to a running container, the processes inside it (basically the container itself) are suspended, as if by SIGSTOP.
When docker unpause is issued, the container processes are resumed, as if by SIGCONT.
When docker stop is issued to a running container, SIGTERM is sent to the container processes, which terminates them and stops the container.
Hence, if docker pause is issued to a container and the docker service is then restarted, the cgroups allocated to the container are released (because SIGTERM is sent to all the container processes).
So after the restart, unpause does not help, because the containers are stopped.
I like to use Jupyter Notebook. If I run it in a VM in virtualbox, I can save the state of the VM, and then pick up right where I left off the next day. Can I do something similar if I were to run it in a docker container? i.e. dump the "state" of the container to disk, then crank it back up and reload the "state"?
It looks like docker checkpoint may be the thing I'm attempting to accomplish here. There's not much in the docs that describes it as such. In fact, the docs for docker checkpoint say "Manage checkpoints" which is massively unhelpful.
UPDATE: This IS, in fact, what docker checkpoint is supposed to accomplish. When I checkpoint my jupyter notebook container, it saves it, I can start it back up with docker start --checkpoint [my_checkpoint] jupyter_notebook, and it shows the things I had running as being in a Running state. However, attempts to then use the Running notebooks fail. I'm not sure if this is a CRIU issue or a Jupyter issue, but I'll bring it up in the appropriate git issue tracker.
Anyhoo, docker checkpoint is the thing that is supposed to provide VM-save-state/hibernate style functionality.
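A rough sketch of the save/restore round trip described above, driven from Python. It assumes the Docker daemon has the experimental checkpoint feature enabled and CRIU installed on the host; the function names are hypothetical, and the container and checkpoint names are whatever you pass in. The two command lines are the docker checkpoint create and docker start --checkpoint forms.

```python
import subprocess

def checkpoint_command(container, checkpoint):
    """Argv for saving the container's process state under a checkpoint name."""
    return ["docker", "checkpoint", "create", container, checkpoint]

def restore_command(container, checkpoint):
    """Argv for starting the container again from a saved checkpoint."""
    return ["docker", "start", "--checkpoint", checkpoint, container]

def save_state(container, checkpoint, run=subprocess.run):
    # check=True raises CalledProcessError if docker reports a failure.
    run(checkpoint_command(container, checkpoint), check=True)

def restore_state(container, checkpoint, run=subprocess.run):
    run(restore_command(container, checkpoint), check=True)
```

For the notebook case this would be save_state("jupyter_notebook", "my_checkpoint") in the evening and restore_state("jupyter_notebook", "my_checkpoint") the next day. Whether the restored processes behave correctly is a separate matter, as the update above found.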
The closest approach I can see is docker pause <container-id>
https://docs.docker.com/engine/reference/commandline/pause/
The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the cgroups freezer the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed.
An important difference from VirtualBox hibernation is that the memory state of the containerized process is not persisted to disk.
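If you just stop the container, you can start it again later. Its filesystem is kept, although the processes inside are terminated and started afresh rather than resumed: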
docker stop myjupyter
(hours pass)
docker start myjupyter
docker attach myjupyter
I do this all the time, especially with docker containers which have web browsers in them.
Quite often, when I start my docker-composed app, I like to check that everything started correctly and everything's fine.
So I do docker-compose up, look at the logs, and then I have to do docker-compose stop and docker-compose up -d.
Those are too many steps and having to stop the container means downtime on my server.
Ain't there a way to send docker to the background?
I tried Ctrl+Z but then if I try to exit the ssh session, I get There are stopped jobs., so that's not the correct way to do this.
I use docker-compose, but I'd be curious if this is possible with docker also.
Thanks
After Ctrl+Z, just run bg; the task will continue running in the background, and you can safely close the SSH session.
In my application consisting of several containers, I pause containers which are currently not needed. When they are needed again, I unpause them. This works fine.
However, if something goes wrong in one of the running containers (a container exits with exit code != 0), docker-compose (which I am also using) tries to stop all the other containers. If a container is paused, it cannot be stopped or killed.
A small example to illustrate what happens. (all of these commands are automated in my case)
docker start cd1d8ad01f56
docker pause cd1d8ad01f56
docker stop cd1d8ad01f56
Error response from daemon: Cannot stop container cd1d8ad01f56:
Container cd1d8ad01f56c695a598e168e2eacdcd20a5231b9240029db1579bc0f1dcb903
is paused. Unpause the container before stopping
Error: failed to stop containers: [cd1d8ad01f56]
I want the containers to be stopped, even if they are paused.
Solutions I thought of:
First unpause every paused container, then stop or kill it. This is an unsuitable solution that requires manual work. But it works...
I could write a script that looks for paused containers and then unpauses and kills them. But I want compose to just kill all the other stuff and be done with it. I do not want to have to issue another command to execute my script.
Is there a way to specify code that runs when a container exits (i.e., to tell it to unpause the other containers), so that the containers are no longer paused when compose tries to stop them?
Unpausing every paused container first and then stopping or killing it is tedious work that I would like to automate. I am working in a test environment and do not care how the containers shut down. I just want them to end together with the failed container(s).
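The script idea from the list of solutions could look roughly like this. It is a sketch, not a compose feature: it asks docker ps for paused containers, then unpauses and stops each one. The function names are hypothetical, and it acts on every paused container on the host, not only those of one compose project.

```python
import subprocess

def list_paused_containers(run=subprocess.run):
    """Return the IDs of paused containers reported by docker ps."""
    result = run(
        ["docker", "ps", "--quiet", "--filter", "status=paused"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()

def shutdown_commands(container_ids):
    """Build the unpause-then-stop command pair for each container."""
    commands = []
    for container_id in container_ids:
        commands.append(["docker", "unpause", container_id])
        commands.append(["docker", "stop", container_id])
    return commands

def stop_paused_containers(run=subprocess.run):
    for command in shutdown_commands(list_paused_containers(run)):
        run(command, check=True)

if __name__ == "__main__":
    # Dry run: show what would be executed for the container from the example.
    for command in shutdown_commands(["cd1d8ad01f56"]):
        print(" ".join(command))
```

Calling stop_paused_containers() before the compose shutdown would still be an extra step, so this covers the automation part of the question but not the wish for compose to do it on its own.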
I can suspend the processes running inside a container with the PAUSE command. Is it possible to clone the Docker container whilst paused, so that it can be resumed (i.e. via the UNPAUSE command) several times in parallel?
The use case for this is a process which takes long time to start (i.e. ~20 seconds). Given that I want to have a pool of short-living Docker containers running that process in parallel, I would reduce start-up time for each container a lot if this was somehow possible.
No, you can only clone the container's disk image, not any running processes.
Yes, you can, using docker checkpoint (CRIU). This does not have anything to do with pause, though; it is a separate docker command.
Also see here.