Cron job to kill all hanging Docker containers

I am new to Docker, but we have containers being deployed, and due to some internal application network bugs the process running in a container sometimes hangs and the container is never terminated. While we debug this issue, I would like a way to find all of those containers and set up a cron job to periodically check for and kill them.
So how would I determine from "docker ps -a" which containers should be dropped, and how would I go about killing them? Any ideas? We are eventually moving to Kubernetes, which will help with these issues.

Docker already has a command to clean up the Docker environment; you can run it manually or set up a job to run the following command:
$ docker system prune
Remove all unused containers, networks, images (both dangling and
unreferenced), and optionally, volumes.
Refer to the documentation for more details on advanced usage.
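If you need to target only the hung containers rather than prune everything, one option is to rely on a container HEALTHCHECK and kill anything Docker reports as unhealthy. A minimal sketch, assuming your images define a HEALTHCHECK, and treating the script path and log file below as placeholders:
#!/bin/sh
# kill-hung-containers.sh (hypothetical name): kill every container whose
# HEALTHCHECK currently reports it as unhealthy.
for id in $(docker ps -q --filter "health=unhealthy"); do
    echo "$(date) killing unhealthy container $id"
    docker kill "$id"
done
The matching crontab entry, run every five minutes:
*/5 * * * * /usr/local/bin/kill-hung-containers.sh >> /var/log/kill-hung-containers.log 2>&1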

Related

How does Vagrant restart docker containers on bootup?

I have a Vagrant configuration that provisions a few docker containers.
I start the machine with "vagrant up", then "vagrant halt" the machine and remove the provisioning for these containers.
On the next "vagrant up" I see these containers starting up anyway. It seems like the provisioning from the last run persisted somehow. I can only assume that the provisioning model is persistent. Is it?
How does Vagrant arrange for these containers to start at boot? How do I stop this from happening?
I doubt it's Vagrant per se restarting the containers, unless that was specifically built into your VM. It depends on a number of factors, starting with the Docker restart policy, but it could also come down to how your Docker daemon is set up or how the "halt" event is handled by the VM host.
The Docker images and each container's file system persist after Docker shuts down, so you could provide a cleanup script to remove them before shutting down, as well as make sure the restart policy is --restart=no (which should be the default). (You should be able to run docker inspect -f "{{ .HostConfig.RestartPolicy }}" <container> to view the current policy.)
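For example, to check and then clear the restart policy of an existing container (the container name is a placeholder, and docker update needs a reasonably recent Docker version):
$ docker inspect -f "{{ .HostConfig.RestartPolicy.Name }}" mycontainer   # prints e.g. "always" or "no"
$ docker update --restart=no mycontainer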

Checking reason behind node failure

I have a Docker Swarm setup with three nodes: node-1, node-2 and node-3. For some reason, every day one of my nodes fails; basically, it exits. I ran docker logs <container id of swarm>, but the logs don't contain any info related to the node failure.
So, is there any log file where information about this failure can be seen? Or is this due to some memory allocation problem?
Can anyone suggest how to dig into this problem and find a proper solution? Every day I have to restart my swarm nodes.
Like most containers, Swarm containers run and exit unless you use docker run with the -d option to "daemonize" them. For example:
$ docker run -d swarm join --advertise=172.30.0.69:2375 consul://172.30.0.161:8500
On the other hand, if you used Docker Machine to create the VMs, then also use Docker Machine to create the Swarm manager and nodes. By default, Docker Machine applies TLS authentication to the Docker Engine nodes. The easiest thing to do is to also create the Swarm manager and nodes at the same time as you create the Docker Engine nodes.
For more info, check out the brand new Swarm doc.
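To answer the log-file part of the question, two standard places to look, assuming a systemd-based host (the container name is a placeholder):
$ docker inspect -f "{{ .State.ExitCode }} OOMKilled={{ .State.OOMKilled }}" swarm-agent
$ journalctl -u docker.service --since "24 hours ago"
An exit code of 137 together with OOMKilled=true would point to the memory problem you suspect; the Docker daemon journal often records why a container went down.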

What is the difference between docker-compose up and docker-compose start?

Whenever I execute
docker-compose start
docker-compose ps
I see my containers with the state "UP". If I do
docker-compose up -d
I see more verbose output, but the containers end up in the same state. Is there any difference between the two commands?
docker-compose start
(https://docs.docker.com/compose/reference/start/)
Starts existing containers for a service.
docker-compose up
(https://docs.docker.com/compose/reference/up/)
Builds, (re)creates, starts, and attaches to containers for a service.
Unless they are already running, this command also starts any linked services.
The docker-compose up command aggregates the output of each container
(essentially running docker-compose logs -f). When the command exits,
all containers are stopped. Running docker-compose up -d starts the
containers in the background and leaves them running.
If there are existing containers for a service, and the service’s
configuration or image was changed after the container’s creation,
docker-compose up picks up the changes by stopping and recreating the
containers (preserving mounted volumes). To prevent Compose from
picking up changes, use the --no-recreate flag.
For the complete CLI reference:
https://docs.docker.com/compose/reference/
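For example, if you change a service's image tag in docker-compose.yml while containers from the old tag still exist (a sketch; the service name web is an assumption):
$ docker-compose up -d                 # stops and recreates web using the new image
$ docker-compose up -d --no-recreate   # starts web if it is stopped, but keeps the existing container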
In the Docker frequently asked questions this is explained very clearly:
What’s the difference between up, run, and start?
Typically, you want docker-compose up. Use up to start or restart
all the services defined in a docker-compose.yml. In the default
“attached” mode, you see all the logs from all the containers. In
“detached” mode (-d), Compose exits after starting the containers, but
the containers continue to run in the background.
The docker-compose run command is for running “one-off” or “adhoc”
tasks. It requires the service name you want to run and only starts
containers for services that the running service depends on. Use run
to run tests or perform an administrative task such as removing or
adding data to a data volume container. The run command acts like
docker run -ti in that it opens an interactive terminal to the
container and returns an exit status matching the exit status of the
process in the container.
The docker-compose start command is useful only to restart containers
that were previously created, but were stopped. It never creates new
containers.
What is the difference between up, run and start in Docker Compose?
docker-compose up: Builds, (re)creates, and starts containers. It also attaches to containers for a service.
docker-compose run: Runs one-off or ad hoc tasks. The service name has to be provided, and Compose starts only that specific service plus any services it depends on. It is helpful for testing containers and for performing administrative tasks.
docker-compose start: Starts previously created, stopped containers; it can't create new ones.
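A quick way to see the difference yourself, with a deliberately tiny compose file (the file and service name below are only an illustration):
# docker-compose.yml
version: "3"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
$ docker-compose up -d      # creates and starts the web container
$ docker-compose stop       # stops it; the container still exists
$ docker-compose start      # restarts the existing container, creating nothing
$ docker-compose ps         # shows the same "Up" state after either command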

Deploying changes to Docker and its containers on the fly

Brand spanking new to Docker here. I have Docker running on a remote VM and am running a single dummy container on it (I can verify the container is running by issuing a docker ps command).
I'd like to secure my Docker installation by giving a non-root user access to Docker:
sudo usermod -aG docker myuser
But I'm afraid to muck around with Docker while any containers are running in case "hot deploys" create problems. So this has me wondering, in general: if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Same goes for the containers themselves. Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5, how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
if I want to do any sort of operational work on Docker (daemon, I presume) while there are live containers running on it, what do I have to do? Do all containers need to be stopped/halted first? Or will Docker keep on ticking and apply the updates when appropriate?
Usually, all containers are stopped first.
That typically happens when I upgrade Docker itself: I find all my containers stopped (except the data containers, which were only ever created, and remain so).
Say I have a myapp-1.0.4 container deployed to a Docker daemon. Now I want to deploy myapp-1.0.5, how does this work? Do I stop 1.0.4, remove it from Docker, and then deploy/run 1.0.5? Or does Docker handle this for me under the hood?
That depends on the nature and requirements of your app: for a completely stateless app, you could even run 1.0.5 alongside it (with different host ports mapped to your app's exposed port), test it a bit, and stop 1.0.4 when you think 1.0.5 is ready.
But for an app with any kind of shared state or resource (mounted volumes, a shared data container, ...), you would need to stop and rm 1.0.4 before starting the new container from the 1.0.5 image.
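For the stateless case, that could look something like this (image names, container names and ports are placeholders):
$ docker run -d -p 8081:8080 --name myapp-1.0.5 myapp:1.0.5   # new version on a different host port
# test the new version on port 8081, then retire the old one:
$ docker stop myapp-1.0.4
$ docker rm myapp-1.0.4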
(1) why don't you stop them [the data containers] when upgrading Docker?
Because... they were never started in the first place.
In the lifecycle of a container, you can create and then start a container (docker run does both). But a data container, by definition, has no process to run: it just exposes VOLUMEs for other containers to mount (--volumes-from).
(2) What's the difference between a data/volume container, and a Docker container running, say a full bore MySQL server?
The difference is, again, that a data container doesn't run any process, so it doesn't exit when said process stops. That never happens, since there is no process to run.
The MySQL server container would be running as long as the server process doesn't stop.
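To make the distinction concrete, here is roughly what that looks like on the command line (the names are illustrative):
$ docker create -v /var/lib/mysql --name dbstore busybox                 # data container: created, never started
$ docker run -d --volumes-from dbstore -e MYSQL_ROOT_PASSWORD=secret --name db mysql
$ docker ps -a --format "{{.Names}}: {{.Status}}"                        # dbstore stays "Created", db is "Up ..." while mysqld runs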

sharing docker.sock or docker in docker (dind)

We're running a Mesos cluster and Jenkins for our continuous integration workflow.
Jenkins is configured with the Mesos plugin.
Previously we built our Docker images in Mesos containers. Now we are switching to Docker containers for building our Docker images.
I've been searching for the advantages of building our Docker images inside a Docker container with a DinD image like the "dind-jenkins-slave" found on Docker Hub.
With DinD you lose the caching opportunities you get when sharing the host's docker.sock. And with DinD you also have to pass the --privileged flag.
What is the downside of just sharing the docker.sock of the host?
I'm using the shared docker.sock approach. The only downside I see is security: you can do whatever you want with the host once you can run arbitrary Docker containers. But if you trust the people who create jobs, or can control which Docker containers can be run from Jenkins and with which options, then giving access to the main Docker daemon is an easy solution.
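For reference, the docker.sock approach just bind-mounts the host socket into the build container, so the container only needs the Docker client installed and all builds share the host daemon's image cache; the image name and workspace path here are placeholders:
$ docker run --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ${WORKSPACE}:/workspace \
    my/jenkins-slave \
    docker build -t myorg/myapp /workspace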
It depends on what you're doing, really. To get our Jenkins jobs truly isolated so that we can run as many as we want in parallel, we switched to DinD. If you share the host socket you still only have a single Docker daemon: port conflicts, pulling/pushing multiple images from separate jobs, and one job relying on an image or build that is also being modified by another job are all issues.
To get around the caching issue, I create the DinD container and leave it around. I run
docker start -a dindslave || docker run -v ${WORKSPACE}:/data my/dindimage jenkinscommands.sh
Then Jenkins just writes its commands to jenkinscommands.sh and restarts the container every time. When you remove the container you remove your cache as well, and you don't share caches between jobs, if that is something you want... but you don't have to think about jobs interfering with one another or making sure they are running on the same host.
