If there is a simple run command specified on the command line, or with CMD, the container stops when the program exits. But what if:
the program spawns new processes, and then exits?
'exec' is used on the command line, and then the first command exits?
Can you please also point to the docs?
Thanks!
The process you start with docker run will be the process with PID 1 (inside the process namespace of the container). This process is special in UNIX / Linux systems: it is the process in charge of 'adopting' any 'orphaned' process. If this process ends, all the other processes end as well.
So, answering your questions: if this initial process (the one executed by docker run) ends, all the processes inside your container will also end. I have not found any official documentation on this, but there is a great post from Phusion discussing the topic.
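A quick way to observe this behavior (a minimal demo with a stock ubuntu image, not from any official docs):
docker run --rm ubuntu bash -c 'sleep 1000 & echo "PID 1 exiting now"'
The backgrounded sleep is orphaned the moment the shell (PID 1) exits, so the container stops immediately instead of staying up for the remaining sleep.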
I built an image, say my_dbt_image, with a custom install of a program, say dbt. I also wrote a little wrapper script my_dbt like:
#!/usr/bin/env bash
# define flags, e.g. --workdir=/vol and --volume "$PWD:/vol"
flags=(--workdir=/vol --volume "$PWD:/vol")
docker run --rm "${flags[@]}" my_dbt_image dbt "$@"
so that when a user enters my_dbt <args> in their terminal, the script actually runs dbt <args> inside the container. Hope this makes sense.
Seems to work fine, maybe a bit slow. I figure that to speed things up, instead of running a new container every time the user enters a new command, I should reuse the same container, leveraging docker exec.
Currently, after the command is run, the container goes in stopped state (status exited). This makes sense. I'm a bit confused about the logic of docker exec. Why does the container need to be running in order to throw a new command at it?
In my situation, do you think I should:
stop the container after each user command is executed, and (re)start it when a new user command is entered, or
keep the container running?
Any comments or advice on my approach are welcome.
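For what it's worth, a minimal sketch of the second option (the container name my_dbt_daemon and the sleep infinity keep-alive are assumptions, not part of the original setup): start one long-lived container whose PID 1 never exits, then turn each user command into an exec against it.
docker run -d --name my_dbt_daemon --workdir=/vol --volume "$PWD:/vol" my_dbt_image sleep infinity
docker exec my_dbt_daemon dbt "$@"
docker exec needs a running container because it injects the new process into the existing container's namespaces, and those only exist while PID 1 is alive.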
I have looked for a bit on Stack Overflow for a way to have a container start up and wait for an external connection but have not seen anything.
Here is what my process looks like currently:
Non-Docker external process reaches out at X interval and tells system to run a command.
Command runs.
System should remain idle until the next interval.
Now I have seen a few options with --wait or sleep but I would think that would not allow the container to receive the connection.
I also looked at the 'wait for container' script that is often recommended, but in this case I need the container to wait for a script to call it at undefined intervals.
I have tried having the container just run the help command for my process, but after a bit of time the container fails, and that makes it a mess to find anything.
I have also tried starting the container with no command, just the base OS, so it would wait for the call, but that did not work either.
I was looking at this wrong.
Ended up just running it like any other web server or database server.
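For the record, the other common pattern here is to give the container a command that never exits and then drive it from outside with docker exec (a sketch with placeholder names):
docker run -d --name idle_worker my_image sleep infinity
docker exec idle_worker /path/to/command
The sleep infinity keeps PID 1 alive indefinitely, so the container sits idle between calls and the external process can exec into it at any interval.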
I have a Docker image that needs to be run in an environment where I have no admin privileges, using Slurm 17.11.8 on RHEL. I am using udocker to run the container.
In this container, there are two applications that need to run:
[1] ROS simulation (there is a rosnode that is a TCP client talking to [2])
[2] An executable (TCP server)
So [1] and [2] need to run together, and they share some common files as well. Usually, I run them in separate terminals, but I have no idea how to do this with Slurm.
Possible solutions:
(A) Use two containers of the same image, but then their files will be stored locally; I could use volumes instead. But this requires me to change my code significantly, and it may break compatibility when I am not running it as containers (e.g. in Eclipse).
(B) Use a bash script to launch two terminals and run [1] and [2]. Then srun this script.
I am looking at (B) but have no idea how to approach it. I looked into other approaches but they address sequential executions of multiple processes. I need these to be concurrent.
If it helps, I am using xfce-terminal though I can switch to other terminals such as Gnome, Konsole.
This is a shot in the dark since I don't work with udocker.
In your Slurm submit script, to be submitted with sbatch, you could allocate enough resources for both jobs to run on the same node (so you just need to reference localhost for your client/server). Start your first process in the background with something like:
udocker run container_name container_args &
The & should start the first container in the background.
You would then start the second container:
udocker run 2nd_container_name more_args
This one runs without & to keep the process in the foreground. Ideally, when the second container completes, the script completes and Slurm's cleanup kills the first container. If both containers come to an end cleanly, you can put a wait at the end of the script.
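Putting that together, a rough sbatch sketch (container names are placeholders, udocker's run subcommand is assumed, and the resource flags will depend on your cluster):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2
udocker run server_container &   # first container, backgrounded
SERVER_PID=$!
udocker run client_container     # second container, foreground
kill "$SERVER_PID" 2>/dev/null   # clean up the background container
wait
Capturing SERVER_PID up front also covers the first caveat below.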
Caveats:
Depending on how Slurm is configured, processes may not be properly cleaned up at the end. You may need to capture the PID of the first udocker as a variable and kill it before you exit.
The first container may still be processing when the second completes. You may need to add a sleep command at the end of your submission script to give it time to finish.
Any number of other gotchas may exist that you will need to find and hopefully work around.
I need to configure a program running in a Docker container. To achieve that, the program must be running (and provide an open port) so that the administration program can connect to the running process. Unfortunately, there is no simple editable config file, so this is the only way. The RUN command is obviously not the right one, because it does not provide a running instance after Docker goes on to the next command. The best way would be to do this while building the Docker image, but doing it during container start would be OK as well. However, as far as I know, there is also no easy way to run multiple commands on startup. Does anyone have an idea how to do that?
To make it a bit more clear, here is a simple example from my Dockerfile:
# this command should start the application which has to be configured
RUN /usr/local/server/server.sh
# I tried this command alternatively because the shell script is blocking
RUN nohup /usr/local/server/server.sh &
# this is the command which starts an administration program which connects to the running instance started above
RUN /usr/local/administration/adm [some configuration parameters...]
# afterwards the server process can be stopped
Downloading the complete program directory containing the correct state could be a solution, too. But then the configuration could no longer be changed easily in the Dockerfile, which is what I would really like.
A Dockerfile is supposed to be a sequential list of instructions to produce an image. The image should contain your application's code, and all of its installable dependencies.
Each RUN instruction gets executed as its own container. Once the command that you run completes, any changed files get committed as a new image layer.
Trying to run a process in the background will cause the command you are running to return immediately. Once that happens, the container is considered stopped, and the Dockerfile's next instruction will be executed in a new, separate container.
If you really need two processes running, you will need to produce a command that you can pass to a single RUN instruction.
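For example, a single RUN could start the server in the background, run the administration tool, and then stop the server again, all within one layer (a sketch only: the sleep is a crude stand-in for a real readiness check, and the bracketed parameters are the ones elided above):
RUN /usr/local/server/server.sh & \
    SERVER_PID=$! ; \
    sleep 10 ; \
    /usr/local/administration/adm [some configuration parameters...] ; \
    kill "$SERVER_PID"
Whatever state the administration tool wrote to disk before the layer is committed then ends up baked into the image.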
I'm running into an issue where my Docker container exits with exit code 137 after about a day of running. The logs for the container contain no information indicating that an error occurred. Additionally, attempts to restart the container return an error that the PID already exists for the application.
The container is built using the sbt docker plugin (sbt docker:publishLocal) and is then run using
docker run --name=the_app --net=the_app_nw -d the_app:1.0-SNAPSHOT.
I'm also running 3 other Docker containers, which all together use 90% of the available memory, but it's only ever this particular container that exits.
Looking for any advice on where to look next.
The exit code 137 (128 + 9) means that the process was killed (as with kill -9 yourApp) by something. That something can be a lot of things: maybe Docker killed it for using too many resources, maybe it ran out of memory, etc.
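One quick check for the out-of-memory case is Docker's inspect output (these are standard fields, shown here against the container name from your run command):
docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' the_app
If OOMKilled prints true, the kernel killed the process for exceeding its memory limit; given that your containers already use 90% of the available memory, that would be the first thing to rule out.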
Regarding the PID problem, you can add this to your build.sbt:
javaOptions in Universal ++= Seq(
  "-Dpidfile.path=/dev/null"
)
Basically, this instructs Play not to create a RUNNING_PID file. If it does not work, you can try passing that option directly in Docker using the JAVA_OPTS env variable.
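For example (assuming the image's startup script picks up JAVA_OPTS, which scripts generated by sbt-native-packager typically do):
docker run --name=the_app --net=the_app_nw -d \
  -e JAVA_OPTS="-Dpidfile.path=/dev/null" \
  the_app:1.0-SNAPSHOT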