Persist Docker container after running short-lived scripts

I have two Python scripts that I'm running in a container. The first script loads some data from disk, does some manipulation, and then saves the output in the container. The second script does a similar thing, again saving output on the container. However, once these scripts are done running, my container is essentially "done", and Kubernetes re-deploys the same build, forever. I want to be able to run these scripts once but access those results whenever I need them, without the container continuously being rebuilt.
Here's my Dockerfile, generally:
FROM X
...
RUN python3 script1.py
RUN python3 script2.py
Currently I'm trying CMD sleep infinity so that I can access the container through a shell later, but that isn't working. I've also tried ENTRYPOINT ["sh"], to no avail.
So generally, the Dockerfile I'm now using looks like this:
FROM X
...
RUN python3 script1.py
RUN python3 script2.py
CMD sleep infinity

In Kubernetes/OpenShift you would use a Job. But to save results you will also need to claim a persistent volume and mount it into the Pod for the Job, giving you a place to save the results. You could create a temporary pod later on to access the results from the persistent volume.
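A rough sketch of that setup, assuming an image named my-scripts, a mount path of /output, and an already-created PersistentVolumeClaim named results-pvc (all three names are placeholders):

apiVersion: batch/v1
kind: Job
metadata:
  name: run-scripts
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: scripts
          image: my-scripts
          command: ["sh", "-c", "python3 script1.py && python3 script2.py"]
          volumeMounts:
            - name: results
              mountPath: /output
      volumes:
        - name: results
          persistentVolumeClaim:
            claimName: results-pvc

The scripts would then need to write their results under /output; a later throwaway pod mounting the same claim can read them whenever you like.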

You can use docker-compose here. What you can do is mount a directory from your host, e.g. /var/opt/logs, or whichever directory you want, to the directory inside the container where your output is stored. Both directories and their files will then stay in sync regardless of whether your container is up or down. That's how I mount the scripts I need while the container is up and running.
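A minimal docker-compose sketch of that, assuming the container writes its output to /app/output (both paths are placeholders):

services:
  app:
    build: .
    volumes:
      - /var/opt/logs:/app/output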

Related

Difference between Docker Build and Docker Run

If I want to run a Python script in my container, what is the point of having the RUN command if I can pass in an argument at build time along with running the script?
Each time I run the container, I want x.py to be run with an ENV variable passed in during the build stage.
If I were to use Swarm, and the only goal was to run the x.py script, Swarm would only be building nodes, rather than building and eventually running, since the CMD and ENTRYPOINT instructions only happen at run time.
Am I missing something?
The docker build command creates an immutable image. The docker run command creates a container that uses the image as a base filesystem, with other metadata from the image used as defaults for running the container.
Each RUN line in a Dockerfile adds a layer to the image filesystem in Docker. Docker actually performs that task in a temporary container, hence the selection of the confusing "run" term. The only thing preserved from a RUN command is the set of filesystem changes; running processes, changes to environment variables, and shell settings like the current working directory are all lost when the temporary container is cleaned up at the completion of the RUN command.
The ENTRYPOINT and CMD values are used to specify the default command to run when the container is started. When both are defined, the entrypoint is run with the value of CMD appended as command line arguments. The value of CMD is easily overridden at the end of the docker run command line, so by using both you get easy-to-reconfigure containers that run the same command with different user input parameters.
If the command you are trying to run needs to be performed every time the container starts, rather than being stored in the immutable image, then you need to perform that command in your ENTRYPOINT or CMD. This will add to the container startup time, so if the result of that command can be stored as a filesystem change and cached for all future containers being run, you want to make that setting in a RUN line.
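To illustrate the ENTRYPOINT/CMD interplay, a hypothetical image (the script name and flags are made up):

FROM python:3
COPY x.py /x.py
ENTRYPOINT ["python3", "/x.py"]
CMD ["--mode", "default"]

Running docker run my-image executes python3 /x.py --mode default, while docker run my-image --mode debug overrides only the CMD portion at the end of the command line.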

docker ubuntu sourcing after starting image

I built myself an image for ROS. I run it while mounting my original home from the host, plus some tricks to get graphics working as well. After starting the shell inside Docker I always need to execute two source commands. One of the files to be sourced is actually inside the container, but the other resides in my home, which only gets mounted on starting the container. I would like to have these two files sourced automatically.
I tried adding
RUN bash -c "source /opt/ros/indigo/setup.bash"
to the image file, but this did not actually source it. Using CMD instead of RUN didn't drop me into the container's shell (I assume it finished executing source and then exited?). I don't even know how to source the file that is only available after startup. What would I need to do?
TL;DR: you need to perform this step as part of your CMD or ENTRYPOINT, and for something like a source command, you need a step after that in the shell to run your app, or whatever shell you'd like. If you just want a bash shell as your command, then put your source command inside something like your .bashrc file. Or you can run something like:
bash -c "source /opt/ros/indigo/setup.bash && bash"
as your command.
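For the .bashrc route, a one-line sketch for the Dockerfile (assuming the container runs as root, so root's .bashrc is the one that matters):

RUN echo "source /opt/ros/indigo/setup.bash" >> /root/.bashrc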
One of the files to be sourced is actually inside the container, but the other resides in my home, which only gets mounted on starting the container.
...
I tried adding ... to the image file
Images are built using temporary containers that only see your Dockerfile instructions and the context sent with that to run the build. Containers use that built image and all of your configuration, like volumes, to run your application. There's a hard divider between those two steps, image build and container run, and your volumes are not available during that image build step.
Each of those RUN steps being performed for the image build are done in a temporary container that only stores the output of the filesystem when it's finished. Changes to your environment, a cd into another directory, spawned processes or services in the background, or anything else not written to the filesystem when the command spawned by RUN exits, will be lost. This is one reason you will see commands chained together in a single long RUN command, and it's why you have ENV and WORKDIR commands in the Dockerfile.
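Putting that together, a sketch of an entrypoint script that sources both files at container start; the path of the setup file in your mounted home is an assumption, so adjust it to wherever yours lives:

#!/bin/bash
# entrypoint.sh
# this file is baked into the image, so it is always present
source /opt/ros/indigo/setup.bash
# this one only exists after the home directory is mounted at container start
if [ -f "$HOME/catkin_ws/devel/setup.bash" ]; then
  source "$HOME/catkin_ws/devel/setup.bash"
fi
exec "$@"

With the matching Dockerfile lines:

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["bash"]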

Docker RUN layer has no mounted volumes

In my docker-compose file, I am mounting a local folder to a folder in the container. I can see and use the mounted volume with CMD in the Dockerfile, but not with RUN. From the docs, RUN seems to operate on a totally clean layer. Is there a way to have RUN use mount points specified in the docker-compose file?
You can't mount volumes during the docker build process, regardless of whether you use docker itself, docker-compose, or some other tool. The whole idea is that the build process is supposed to be as independent of your environment as possible, so that the resulting images have no dependencies on your local system and can be more easily shared.
There are generally alternate ways of approaching whatever problem you're trying to solve that do not require trying to expose data into your build process.
Actually, there's a big difference between CMD and RUN:
CMD is used to provide the command (or arguments to the ENTRYPOINT) that is executed when you start the container.
RUN is used to provide a command to execute in order to create a new image layer.
In short: volumes are not available during the build step (when RUN is executed).
Docker has two ways of providing "external" files:
In the build step, the build context is passed.
In the run step, the container layer + volumes are used.
See for CMD:
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint
See for RUN:
https://docs.docker.com/engine/reference/builder/#run
For context see:
https://docs.docker.com/engine/reference/builder/#usage
(docker-compose) https://docs.docker.com/compose/compose-file/#build
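To make the distinction concrete, a small sketch with placeholder file names. The Dockerfile can only see files from the build context:

COPY config.json /app/config.json

while a volume in docker-compose.yml only appears once a container runs:

services:
  app:
    build: .
    volumes:
      - ./data:/app/data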

How to keep changes inside a container on the host after a docker build?

I have a docker-compose dev stack. When I run docker-compose up --build, the container is built and it executes
Dockerfile:
RUN composer install --quiet
That command writes a bunch of files into the ./vendor/ directory, which is then only available inside the container, as expected. The vendor/ directory that also exists on the host is not touched and is therefore out of date.
Since I use that container for development and want my changes to be available, I mount the current directory inside the container as a volume:
docker-compose.yml:
my-app:
  volumes:
    - ./:/var/www/myapp/
This loads an outdated vendor directory into my container, forcing me to rerun composer install either on the host or inside the container in order to have the up-to-date version.
I wonder how I could manage my docker-compose stack differently, so that the changes during the docker build on the current folder are also persisted on the host directory and I don't have to run the command twice.
I do want to keep the vendor folder mounted, as some vendors are my own and I like being able to modify them in my current project. So only mounting the folders I need to run my application would not be the best solution.
I am looking for a way to tell docker-compose: Write all the stuff inside the container back to the host before adding the volume.
You can run a short side container after docker-compose build:
docker run --rm -v "$(pwd)/vendor:/target" my-app cp -a vendor/. /target/.
The cp could also be something more efficient like an rsync. Then after that container exits, you do your docker-compose up which mounts /vendor from the host.
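For instance, a hypothetical rsync variant (this assumes rsync is actually installed in the image):

docker run --rm -v "$(pwd)/vendor:/target" my-app rsync -a --delete vendor/ /target/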
Write all the stuff inside the container back to the host before adding the volume.
There isn't any way to do this directly, but there are a few options to do it as a second command.
as already suggested you can run a container and copy or rsync the files
use docker cp to copy the files out of a container (without using a volume); see the sketch after this list
use a tool like dobi (disclaimer: dobi is my own project) to automate these tasks. You can use one image to update vendor, and another image to run the application. That way updates are done on the host, but can be built into the final image. dobi takes care of skipping unnecessary operations when the artifact is still fresh (based on modified time of files or resources), so you never run unnecessary operations.
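A sketch of the docker cp option (the container and path names are placeholders); docker create gives you a stopped container to copy from without running anything:

docker create --name vendor-tmp my-app
docker cp vendor-tmp:/var/www/myapp/vendor/. ./vendor
docker rm vendor-tmp

The trailing /. copies the directory's contents into the existing ./vendor on the host rather than nesting a second vendor directory inside it.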

Sharing files between container and host

I'm running a docker container with a volume /var/my_folder. The data there is persistent: When I close the container it is still there.
But I also want to have the data available on my host, because I want to work on the code with an IDE, which is not installed in my container.
So how can I have a folder /var/my_folder on my host machine which is also available in my container?
I'm working on Linux Mint.
I appreciate your help.
Thanks. :)
Link: Manage data in containers
The basic run command you want is ...
docker run -dt --name containerName -v /path/on/host:/path/in/container imageName
The problem is that mounting the volume will, for your purposes, hide whatever already exists at that path in the container.
The best way to overcome this is to create the files you want to share (inside the container) AFTER mounting.
The ENTRYPOINT command is executed on docker run. Therefore, if your files are generated as part of your entrypoint script AND not as part of your build, THEN they will be available from the host machine once mounted.
The solution, therefore, is to run the commands that create the files in the ENTRYPOINT script.
Failing this, copy the files to another directory during the build and then copy them back (with cp, since the Dockerfile's COPY isn't available at run time) in your ENTRYPOINT script.
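A sketch of that last fallback, using the question's /var/my_folder and a placeholder stash directory. In the Dockerfile, stash the generated files during the build:

RUN cp -a /var/my_folder /opt/my_folder_stash

and restore them in the entrypoint script after the volume has been mounted:

#!/bin/sh
# copy the build-time files into the now-mounted host directory
cp -a /opt/my_folder_stash/. /var/my_folder/
exec "$@"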
