My web-application uses graphicsmagick to resize images. Resizing an image will usually take about 500ms. To simpilfy the setup and wall it off I wanted to move the graphicsmagick call inside a docker container and use docker run to execute it. However, running inside a container adds an additional ~300ms, which is not really acceptable for my use case.
To reduce the overhead of starting a container one could run an endless program (sth. like docker run tail -f /dev/null) and then use docker exec to execute the actual call to graphicsmagick inside the running container. However this seems like a big hack.
How would one fix this problem "correctly" with docker or is it just not the right fit here?
This sounds like a good candidate for a microservice, a long-lived server (implemented in your language of choice) listening on a port for resizing requests that uses graphicsmagick under the hood.
Pros:
Implementation details of the resizer are kept inside the container.
Scalability - you can spin up more containers and put them behind a load balancer.
Cons:
You have to implement a server
Pro or Con:
You've started down the road to a microservices architecture.
Using docker exec may be possible, but it's really not intended to be used that way. Cut out the middleman and talk to your container directly.
The best ways is:
create custom deb/rpm package
use one from system repos
But if you like more docker approach, the best visible way is to (example below):
Start "daemon":
docker run --name ubuntu -d -v /path/to/host/folder:/path/to/guest/folder ubuntu:14.04 sleep infinity
Execute command:
docker exec ubuntu <any needed command>
Where:
"ubuntu" - name of the container
-d - de-attach container
-v - mount volume host -> container
sleep infinity - do nothing, used as entry point and is a way better than read operation.
Use mounted volume in case, if you working with files, if not - do not mount volume and just use pipes.
Related
I am trying Docker for the first time and do not yet have a "mental model". Total beginner.
All the examples that I am looking at have included the --rm flag to run, such as
docker run -it --rm ...
docker container run -it --rm ...
Question:
Why do these commands include the --rm flag? I would think that if I were to go through the trouble of setting up or downloading a container with the good stuff in it, why remove it? I want to keep it to use again.
So, I know I have the wrong idea of Docker.
Containers are merely an instance of the image you use to run them.
The state of mind when creating a containerized app is not by taking a fresh, clean ubuntu container for instance, and downloading the apps and configurations you wish to have in it, and then let it run.
You should treat the container as an instance of your application, but your application is embedded into an image.
The proper usage would be creating a custom image, where you embed all your files, configurations, environment variables etc, into the image. Read more about Dockerfile and how it is done here
Once you did that, you have an image that contains everything, and in order to use your application, you just run the image with proper port settings or other dynamic variables, using docker run <your-image>
Running containers with --rm flag is good for those containers that you use for very short while just to accomplish something, e.g., compile your application inside a container, or just testing something that it works, and then you are know it's a short lived container and you tell your Docker daemon that once it's done running, erase everything related to it and save the disk space.
The flag --rm is used when you need the container to be deleted after the task for it is complete.
This is suitable for small testing or POC purposes and saves the headache for house keeping.
From https://docs.docker.com/engine/reference/run/#clean-up---rm
By default a container’s file system persists even after the container exits. This makes debugging a lot easier (since you can inspect the final state) and you retain all your data by default. But if you are running short-term foreground processes, these container file systems can really pile up. If instead you’d like Docker to automatically clean up the container and remove the file system when the container exits, you can add the --rm flag
In short, it's useful to keep the host clean from stopped and unused containers.
When you run a container from an image using a simple command like (docker run -it ubuntu), it spins up a container. You attach to your container using docker attach container-name (or using exec for different session).
So, when you're within your container and working on it and you type exit or ctrl+z or any other way to come out of the container, other than ctrl+p+q, your container exits. That means that your container has stopped, but it is still available on your disk and you can start it again with : docker start container-name/ID.
But when you run the container with —rm tag, on exit, the container is deleted permanently.
I use --rm when connecting to running containers to perform some actions such as database backup or file copy. Here is an example:
docker run -v $(pwd):/mnt --link app_postgres_1:pg --rm postgres:9.5 pg_dump -U postgres -h pg -f /mnt/docker_pg.dump1 app_db
The above will connect a running container named 'app_postgres_1' and create a backup. Once the backup command completes, the container is fully deleted.
The "docker run rm " command makes us run a new container and later when our work is completed then it is deleted by saving the disk space.
The important thing to note is, the container is just like a class instance and not for data storage. We better delete them once the work is complete. When we start again, it starts fresh.
The question comes then If the container is deleted then what about the data in a container? The data is actually saved in the local system and get linked to it when the container is started. The concept is named as "Volume or shared volume".
From the docker philosophy's point of view it is more advisable:
create a container every time we need to use a certain environment and then remove it after use (docker run <image> all the time); or
create a container for a specific environment (docker run <image>), stop it when it is not necessary and whenever it is initialized again (docker start <container>);
If you docker rm the old container and docker run a new one, you will always get a clean filesystem that starts from exactly what's in the original image (plus any volume mounts). You will also fairly routinely need to delete and recreate a container to change basic options: if you need to change a port mapping or an environment variable, or if you need to update the image to have a newer version of the software, you'll be forced to delete the container.
This is enough reason for me to make my standard process be to always delete and recreate the container.
# docker build -t the-image . # can be done first if needed
docker stop the-container # so it can cleanly shut down and be removed
docker rm the-container
docker run --name the-container ... the-image
Other orchestrators like Docker Compose and Kubernetes are also set up to automatically delete and recreate the container (or Kubernetes pod) if there's a change; their standard workflows do not generally involve restarting containers in-place.
I almost never use docker start. In a Compose-based workflow I generally use only docker-compose up -d, letting it restart things if needed; docker-compose down if I need the CPU/memory resources the container stack was using but not in routine work.
I'm talking with regards to my experience in the industry so take my answer with a grain of salt, because there might be no hard evidence or reference to the theory.
Here's the answer:
TL;DR:
In short, you never need the docker stop and docker start because taking this approach is unreliable and you might lose the container and all the data inside if no proper action is applied beforehand.
Long answer:
You should only work with images and not the containers. Whenever you need some specific data or you need the image to have some customization, you better use docker save to have the image for future use.
If you're just testing out on your local machine, or in your dev virtual machine on a remote host, you're free to use either one you like. I personally take each of the approaches on different scenarios.
But if you're talking about a production environment, you'd better use some orchestration tool; it could be as simple and easy to work with as docker-compose or docker swarm or even Kubernetes on more complex environments.
You better not take the second approach (docker run, docker stop & docker start) in those environments because at any moment in time you might lose that container and if you are solely dependent on that specific container or it's data, then you're gonna have a bad weekend.
So, is there a point in the command "start"? like in "docker start -i albineContainer".
If I do this, I can't really do anything with the albine inside the container, I would have to do a run and create another container with the "-it" command and "sh" after (or "/bin/bash", don't remember it correctly right now).
Is that how it will go most of the times? delete and rebuilt containers and do the command "-it" if you want to do stuff in them? or would it more depend on the Dockerfile, how you define the cmd.
New to Docker in general and trying to understand the basics on how to use it. Thanks for the help.
Running docker run/exec with -it means you run the docker container and attach an interactive terminal to it.
Note that you can also run docker applications without attaching to them, and they will still run in the background.
Docker allows you to run a program (which can be bash, but does not have to be) in an isolated environment.
For example, try running the jenkins docker image: https://hub.docker.com/_/jenkins.
this will create a container, without you having attach to it, and you would still be able to use it.
You can also attach to an existing, running container by using docker exec -it [container_name] bash.
You can also use docker logs to peek at the stdout of a certain docker container, without actually attaching to its shell interactively.
You almost never use docker start. It's only possible to use it in two unusual circumstances:
If you've created a container with docker create, then docker start will run the process you named there. (But it's much more common to use docker run to do both things together.)
If you've stopped a container with docker stop, docker start will run its process again. (But typically you'll want to docker rm the container once you've stopped it.)
Your question and other comments hint at using an interactive shell in an unmodified Alpine container. Neither is a typical practice. Usually you'll take some complete application and its dependencies and package it into an image, and docker run will run that complete packaged application. Tutorials like Docker's Build and run your image go through this workflow in reasonable detail.
My general day-to-day workflow involves building and testing a program outside of Docker. Once I believe it works, then I run docker build and docker run, and docker rm the container once I'm done. I rarely run docker exec: it is a useful debugging tool but not the standard way to interact with a process. docker start isn't something I really ever run.
I am trying Docker for the first time and do not yet have a "mental model". Total beginner.
All the examples that I am looking at have included the --rm flag to run, such as
docker run -it --rm ...
docker container run -it --rm ...
Question:
Why do these commands include the --rm flag? I would think that if I were to go through the trouble of setting up or downloading a container with the good stuff in it, why remove it? I want to keep it to use again.
So, I know I have the wrong idea of Docker.
Containers are merely an instance of the image you use to run them.
The state of mind when creating a containerized app is not by taking a fresh, clean ubuntu container for instance, and downloading the apps and configurations you wish to have in it, and then let it run.
You should treat the container as an instance of your application, but your application is embedded into an image.
The proper usage would be creating a custom image, where you embed all your files, configurations, environment variables etc, into the image. Read more about Dockerfile and how it is done here
Once you did that, you have an image that contains everything, and in order to use your application, you just run the image with proper port settings or other dynamic variables, using docker run <your-image>
Running containers with --rm flag is good for those containers that you use for very short while just to accomplish something, e.g., compile your application inside a container, or just testing something that it works, and then you are know it's a short lived container and you tell your Docker daemon that once it's done running, erase everything related to it and save the disk space.
The flag --rm is used when you need the container to be deleted after the task for it is complete.
This is suitable for small testing or POC purposes and saves the headache for house keeping.
From https://docs.docker.com/engine/reference/run/#clean-up---rm
By default a container’s file system persists even after the container exits. This makes debugging a lot easier (since you can inspect the final state) and you retain all your data by default. But if you are running short-term foreground processes, these container file systems can really pile up. If instead you’d like Docker to automatically clean up the container and remove the file system when the container exits, you can add the --rm flag
In short, it's useful to keep the host clean from stopped and unused containers.
When you run a container from an image using a simple command like (docker run -it ubuntu), it spins up a container. You attach to your container using docker attach container-name (or using exec for different session).
So, when you're within your container and working on it and you type exit or ctrl+z or any other way to come out of the container, other than ctrl+p+q, your container exits. That means that your container has stopped, but it is still available on your disk and you can start it again with : docker start container-name/ID.
But when you run the container with —rm tag, on exit, the container is deleted permanently.
I use --rm when connecting to running containers to perform some actions such as database backup or file copy. Here is an example:
docker run -v $(pwd):/mnt --link app_postgres_1:pg --rm postgres:9.5 pg_dump -U postgres -h pg -f /mnt/docker_pg.dump1 app_db
The above will connect a running container named 'app_postgres_1' and create a backup. Once the backup command completes, the container is fully deleted.
The "docker run rm " command makes us run a new container and later when our work is completed then it is deleted by saving the disk space.
The important thing to note is, the container is just like a class instance and not for data storage. We better delete them once the work is complete. When we start again, it starts fresh.
The question comes then If the container is deleted then what about the data in a container? The data is actually saved in the local system and get linked to it when the container is started. The concept is named as "Volume or shared volume".
I'm dockerizing some of our services. For our dev environment, I'd like to make things as easy as possible for our developers and so I'm writing some scripts to manage the dockerized components. I want developers to be able to start and stop these services just as if they were non-dockerized. I don't want them to have to worry about creating and running the container vs stopping and starting and already-created container. I was thinking that this could be handled using Fig. To create the container (if it doesn't already exist) and start the service, I'd use fig up --no-recreate. To stop the service, I'd use fig stop.
I'd also like to ensure that developers are running containers built using the latest images. In other words, something would check to see if there was a later version of the image in our Docker registry. If so, this image would be downloaded and run to create a new container from that image. At the moment it seems like I'd have to use docker commands to list the contents of the registry (docker search) and compare that to existing local containers (docker ps -a) with the addition of some greping and awking or use the Docker API to achieve the same thing.
Any persistent data will be written to mounted volumes so the data should survive the creation of a new container.
This seems like it might be a common pattern so I'm wondering whether anyone else has given these sorts of scenarios any thought.
This is what I've decided to do for now for our Neo4j Docker image:
I've written a shell script around docker run that accepts command-line arguments for the port, database persistence directory on the host, log file persistence directory on the host. It executes a docker run command that looks like:
docker run --rm -it -p ${port}:7474 -v ${graphdir}:/var/lib/neo4j/data/graph.db -v ${logdir}:/var/log/neo4j my/neo4j
By default port is 7474, graphdir is $PWD/graph.db and logdir is $PWD/log.
--rm removes the container on exit, however the database and logs are maintained on the host's file system. So no containers are left around.
-it allows the container and the Neo4j service running within it to receive signals so that the service can be gracefully shut down (the Neo4j server gracefully shuts down on SIGINT) and the container exited by hitting ^C or sending it a SIGINT if the developer puts this in the background. No need for separate start/stop commands.
Although I certainly wouldn't do this in production, I think this fine for a dev environment.
I am not familiar with fig but your scenario seems good.
Usually, I prefer to kill/delete + run my container instead of playing with start/stop though. That way, if there is a new image available, Docker will use it. This work only for stateless services. As you are using Volumes for persistent data, you could do something like this.
Regarding the image update, what about running docker pull <image> every N minutes and checking the "Status" that the command returns? If it is up to date, then do nothing, otherwise, kill/rerun the container.