How to test a Dockerfile with minimal overhead

I'm trying to learn how to write a Dockerfile. Currently my strategy is as follows:
1. Guess which commands to write based on the documentation.
2. Run sudo docker-compose up --build -d to build a Docker container.
3. Wait ~5 minutes for my Anaconda packages to install.
4. Find that I made a mistake on step 15 of the Dockerfile, and go back to step 1.
Is there a way to interactively enter the commands for a Dockerfile, or to cache the first 14 successful steps so I don't need to rebuild the entire image? I saw something about docker exec, but it seems that's only for running containers. I also want to keep the same syntax I use in the Dockerfile (i.e. ENTRYPOINT and ENV), since I'm not sure what the bash equivalent is, or whether one exists.

You can run docker-compose without the --build flag; that way you don't have to rebuild the image every time, although since you are testing the Dockerfile itself, I don't know if you have many options here. Docker does cache builds automatically, but a step is only served from cache if nothing has changed in it, or in any step before it, since the last build. There is no way to build an image interactively; Docker doesn't work like that. Lastly, docker exec is just for running commands inside a container that was created from an already-built image.
Some references for you: Docker build cache, Dockerfile best practices.
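To experiment before committing commands to the Dockerfile, one workable approach (not true interactive building, but close) is to start a throwaway shell in your base image and test each step there. A sketch, assuming a Miniconda base image (substitute whatever FROM image you actually use):
# Throwaway container from the base image; --rm deletes it on exit
docker run --rm -it continuumio/miniconda3 bash
# Inside the shell, try each candidate RUN line before adding it, e.g.:
#   conda install -y numpy pandas
# Rough bash equivalents of the Dockerfile instructions mentioned above:
#   ENV FOO=bar      ->  export FOO=bar
#   WORKDIR /app     ->  cd /app
#   ENTRYPOINT [...] ->  no shell equivalent; it only sets the command run at container start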

Related

How to run a tensorflow/tfx container?

I am new to docker, and have downloaded the tfx image using
docker pull tensorflow/tfx
However, I am unable to find anywhere how to successfully launch a container from it.
Here's a naive attempt:
You can use docker image ls to get a list of Docker images on your local machine. Note that an "image" is a read-only template from which containers are created (loosely analogous to a VM image, although a container is not a VM).
To start a container from that image and get a shell inside it, you would use a command like docker run -it --entrypoint bash tensorflow/tfx. This spins up a temporary container based on the tensorflow/tfx image.
By default, Docker assumes you want the latest version of that image stored on your local machine, i.e. tensorflow/tfx:latest in the list. If you want a different one, you can reference a specific image version by tag or hash, e.g. docker run -it --entrypoint bash tensorflow/tfx:1.0.0 or docker run -it --entrypoint bash fe507176d0e6. I typically run docker image ls first and cut & paste the hash, so my notes can be specific about which build I'm referencing even if I later edit the relevant Dockerfile.
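For reference, the docker image ls listing looks like this (the CREATED and SIZE values here are illustrative):
$ docker image ls
REPOSITORY       TAG      IMAGE ID       CREATED       SIZE
tensorflow/tfx   latest   fe507176d0e6   2 weeks ago   3.1GB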
Also note that changes you make inside that container will not be saved to the image. Once you exit the bash shell, the container stops and nothing you did is reflected in the image. The shell is useful for checking the state & file structure of a constructed image. If you want to change the image itself, edit the Dockerfile. Each instruction in a Dockerfile creates a new intermediate image (layer) when the image is built. If you know that something went wrong between lines 5 and 10 of the Dockerfile, you can potentially shell into each of those intermediate images in turn (with the docker run command I gave above) to see what went wrong. Kinda tedious, but it works.
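A sketch of that per-step debugging workflow, assuming the classic (non-BuildKit) builder, which prints an intermediate image ID after each step (the ID below is hypothetical):
# Force the classic builder so intermediate image IDs are printed
DOCKER_BUILDKIT=0 docker build .
#   Step 5/10 : RUN pip install -r requirements.txt
#    ---> a1b2c3d4e5f6
# Shell into the intermediate image from the step you suspect broke
docker run --rm -it --entrypoint bash a1b2c3d4e5f6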
Also note that docker run is not equivalent to running a TFX pipeline. For the latter, you want to look into the TFX CLI commands or otherwise compile the pipeline - and probably upload it to an external Kubeflow server.
Also note that the Docker image is just a starting point for one piece of your TFX pipeline. A full pipeline will require you to specify the components you want, a more-complete Dockerfile, and more. That's a huge topic, and IMO, the existing documentation leaves a lot to be desired. The Dockerfile you create describes the image which will be distributed to each of the workers which process the full pipeline. It's the place to specify dependencies, necessary files, and other custom setup for the machine. Most ML-relevant concerns are handled in other files.

How can I update a Docker image after changing a few lines of code in my app?

I just built a Docker image using the below command in my app's working directory:
docker build -t imagename:latest .
The Docker image built successfully after a few minutes, and the application runs as well once I use the below command:
docker run -p portnumber:portnumber imagename:latest
But now I want to update 2 lines of code in my application codebase. Suppose I have added the code and want to see whether my application is working or not; how could I do that? Do I need to follow the below steps?
1. Delete the Docker image
2. Rebuild the image using the above command
3. See if the app is working or not using the "docker run" command?
I want to know how I can update my Docker image. My Dockerfile is the same and there won't be any changes to it. I don't want to rebuild the whole Docker image again because the size of all the packages was initially around 2 GB. Can anyone help me with what I should do next? Thanks in advance.
OS: Ubuntu
Application framework: Streamlit
Although you asked specifically how to update (rebuild) your Docker image, my guess is that you actually need a different solution.
If you are developing on a dockerized version of your application (which is good), it is impractical to rebuild the image with every change you make to your code.
A better and more common approach is to mount your local folder into the container, so the running container and your local machine actually share a folder.
This way you can just edit your code, and it is reflected in the container immediately.
So, your docker run command might look something like this:
$ docker run -v "$PWD":/path/to/app/in/container -p PORT:PORT IMAGE_NAME
Read more about docker volumes.
Read more about docker for development environments.
Read about using docker-compose for development.
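Since the question mentions Streamlit, whose default port is 8501, the docker run command above might concretely look like this (the /app path is illustrative; it must match where the app lives in the image):
# Mount the current directory over the app directory inside the container
docker run --rm -v "$PWD":/app -p 8501:8501 imagename:latest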
Rebuilding your Docker image might not be as much of a hassle as you think!
When you build an image, each RUN, COPY, or ADD instruction in your Dockerfile becomes a layer of the image. When you rebuild, only the layers whose instructions or inputs changed are rebuilt, along with every layer after them. That is, provided you didn't delete the old image, so its layers are still in the cache.
If you try it, you should see only one or so layers of your image updating (plus those after it).
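To get the most out of that cache, order the Dockerfile so the expensive dependency install comes before the frequently-edited code. A sketch, with an illustrative base image, file names, and Streamlit entry point:
cat > Dockerfile <<'EOF'
FROM python:3.11-slim
WORKDIR /app
# Changes rarely: this layer and the slow install below stay cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Changes often: only these layers rebuild after a code edit
COPY . .
CMD ["streamlit", "run", "app.py"]
EOF
docker build -t imagename:latest .
With this ordering, editing two lines of application code only re-runs the final COPY, not the 2 GB dependency install.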
An alternative would be to not put your code into the build at all, and instead insert it into the container with a volume at runtime. Depending on your use case, it could be an option. But it is quite a different workflow and might not apply.

Use Docker to Distribute CLI Application

I'm somewhat new to Docker. I would like to be able to use Docker to distribute a CLI program, but run the program normally once it has been installed. To be specific, after running docker build on the system, I need to be able to simply run my-program in the terminal, not docker run my-program. How can I do this?
I tried something with a Makefile which runs docker build -t my-program . and then writes a shell script to ~/.local/bin/ called my-program that runs docker run my-program, but this adds another container every time I run the script.
EDIT: I realize this is the expected behavior of docker run, but it does not work for my use case.
Any help is greatly appreciated!
If you want to keep your script, add the remove flag --rm to the docker run command. The remove flag removes the container automatically after the entry-point process has exited.
Additionally, I would personally prefer an alias for this. Simply add something like alias my-program="docker run --rm my-program" to your ~/.bashrc or ~/.zshrc file. This even has the advantage that all parameters after the alias (my-program param1 param2) are automatically forwarded to the entry point of your image without any additional effort.
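If you would rather keep the wrapper script from your Makefile approach instead of an alias, the same forwarding works with "$@" (a sketch; my-program is your image name):
#!/bin/sh
# ~/.local/bin/my-program: forwards all CLI arguments into the container
exec docker run --rm my-program "$@"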

Is there a point in Docker start?

So, is there a point in the command "start"? Like in "docker start -i alpineContainer".
If I do this, I can't really do anything with the Alpine inside the container; I would have to do a run and create another container with the "-it" flags and "sh" after (or "/bin/bash", I don't remember correctly right now).
Is that how it will go most of the time? Delete and rebuild containers, and use "-it" if you want to do stuff in them? Or does it depend more on the Dockerfile and how you define the CMD?
New to Docker in general and trying to understand the basics on how to use it. Thanks for the help.
Running docker run/exec with -it means you run the docker container and attach an interactive terminal to it.
Note that you can also run docker applications without attaching to them, and they will still run in the background.
Docker allows you to run a program (which can be bash, but does not have to be) in an isolated environment.
For example, try running the jenkins docker image: https://hub.docker.com/_/jenkins.
This will create a container without you having to attach to it, and you will still be able to use it.
You can also attach to an existing, running container by using docker exec -it [container_name] bash.
You can also use docker logs to peek at the stdout of a certain docker container, without actually attaching to its shell interactively.
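Putting those together, a sketch (the jenkins/jenkins image and the container name are illustrative):
# Run detached (in the background) without attaching to it
docker run -d --name jenkins-test jenkins/jenkins
# Peek at its stdout without attaching interactively
docker logs jenkins-test
# Attach an interactive shell to the running container
docker exec -it jenkins-test bash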
You almost never use docker start. It's only useful in two unusual circumstances:
If you've created a container with docker create, then docker start will run the process you named there. (But it's much more common to use docker run to do both things together.)
If you've stopped a container with docker stop, docker start will run its process again. (But typically you'll want to docker rm the container once you've stopped it.)
Your question and other comments hint at using an interactive shell in an unmodified Alpine container. Neither is a typical practice. Usually you'll take some complete application and its dependencies and package it into an image, and docker run will run that complete packaged application. Tutorials like Docker's Build and run your image go through this workflow in reasonable detail.
My general day-to-day workflow involves building and testing a program outside of Docker. Once I believe it works, then I run docker build and docker run, and docker rm the container once I'm done. I rarely run docker exec: it is a useful debugging tool but not the standard way to interact with a process. docker start isn't something I really ever run.
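For completeness, the rarely-needed create/start lifecycle from the two bullets above, as a sketch (the image and container name are illustrative):
docker create --name demo alpine sleep 300   # create without starting
docker start demo                            # run the process named at create time
docker stop demo                             # stop it; docker start demo would run it again
docker rm demo                               # what you would typically do instead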

How to get docker-compose to always re-create containers from fresh images?

My docker images are built on a Jenkins CI server and are pushed to our private Docker Registry. My goal is to provision environments with docker-compose which always start the originally built state of the images.
I am currently using docker-compose 1.3.2 as well as 1.4.0 on different machines but we also used older versions previously.
I always used the docker-compose pull && docker-compose up -d commands to fetch the fresh images from the registry and start them up. I believe my preferred behaviour was working as expected up to a certain point in time, but since then docker-compose up started to re-run previously stopped containers instead of starting the originally built images every time.
Is there a way to get rid of this behaviour? Could the fix be wired into the docker-compose.yml configuration file, so it doesn't depend on remembering something on the command line at every invocation?
P.S. Besides finding a way to achieve my goal, I would also love to know a bit more about the background of this behaviour. I think the basic idea of Docker is to build immutable infrastructure. The current behaviour of docker-compose just seems to plainly clash with that approach... or am I missing some points here?
docker-compose up --force-recreate is one option, but if you're using it for CI, I would start the build with docker-compose rm -f to stop and remove the containers and volumes (then follow it with pull and up).
This is what I use:
docker-compose rm -f
docker-compose pull
docker-compose up --build -d
# Run some tests
./tests
docker-compose stop -t 1
The reason containers are re-used rather than recreated is to preserve any data volumes that might be in use (and it also happens to make up a lot faster).
If you're doing CI you don't want that, so just removing everything should get you what you want.
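If anonymous volumes should also be discarded between CI runs, docker-compose rm accepts a -v flag for that:
docker-compose rm -f -v   # -v also removes anonymous volumes attached to the containers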
Update: use up --build which was added in docker-compose 1.7
The only solution that worked for me was the --no-cache flag:
docker-compose build --no-cache
This rebuilds every layer from scratch instead of reusing the cached layers from previous builds (add --pull if you also want fresh base images pulled from the registry).
Per the current official documentation, there is a shortcut that stops and removes the containers and networks created by up (and, with additional flags, volumes and images too), so it will also do the trick:
docker-compose down
Then if you have new changes on your images or Dockerfiles use:
docker-compose build --no-cache
Finally: docker-compose up
In one command: docker-compose down && docker-compose build --no-cache && docker-compose up
docker-compose up --build # still uses the image cache
OR
docker-compose build --no-cache # never uses the cache
You can pass --force-recreate to docker compose up, which should use fresh containers.
I think the reasoning behind reusing containers is to preserve any changes during development. Note that Compose does something similar with volumes, which will also persist between container recreation (a recreated container will attach to its predecessor's volumes). This can be helpful, for example, if you have a Redis container used as a cache and you don't want to lose the cache each time you make a small change. At other times it's just confusing.
I don't believe there is any way you can force this from the Compose file.
Arguably it does clash with immutable infrastructure principles. The counter-argument is probably that you don't use Compose in production (yet). Also, I'm not sure I agree that immutable infra is the basic idea of Docker, although it's certainly a good use case/selling point.
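Combining the question's pull step with this flag, a minimal invocation might be:
docker-compose pull && docker-compose up -d --force-recreate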
docker-compose up --build --force-recreate
I reclaimed 3.5 GB of space on an Ubuntu AWS instance this way.
Clean Docker:
docker stop $(docker ps -qa) && docker system prune -af --volumes
Build again:
docker build .
docker-compose build
docker-compose up
Also, if the compose file has several services and we only want to force a build of one of them:
docker-compose build --no-cache <service>
Together with --force-recreate, you might want to consider using this flag too:
-V, --renew-anon-volumes    Recreate anonymous volumes instead of retrieving data from the previous containers.
I'm not sure from which version this flag is available, so check docker-compose up --help to see whether you have it.
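Combined, that might look like this (myservice is a placeholder):
docker-compose build --no-cache myservice
docker-compose up -d --force-recreate -V myservice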
$ docker-compose build
If something has changed, it will be rebuilt.
