Cleaning up orphaned Docker containers after a Jenkins job is terminated

I work at a large organization that runs hundreds of jobs in a shared Jenkins cluster.
My Jenkins job needs to run integration tests against untrusted code running inside Docker containers. I am concerned that when my Jenkins job gets terminated abruptly (e.g. the job is aborted or times out) I will be left with orphaned containers.
I have tried https://github.com/moby/moby/issues/1905 and ulimits do not work for me (they only work for containers that run bash, and I cannot guarantee that mine will).
I tried https://stackoverflow.com/a/26351355/14731 but --lxc-conf is not a recognized option for Docker for Windows (this needs to run across all platforms supported by Docker).
Any ideas?

You can run a cleanup command as the first and last step of your job. For example: first prune old dead containers, then rename the existing container to old_$jobname and kill it, do whatever else you need, and finally launch your new container:
docker container prune -f
docker rename $jobname old_$jobname
docker kill old_$jobname
docker run --name $jobname ...

By the looks of things, people are handling this outside of Docker.
They add Jenkins post-build steps that clean up orphaned Docker containers on aborted or failed builds.
See Martin Kenneth's build script as an example.
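As a rough illustration of such a post-build step (a sketch only, not taken from that script): assuming every container started by the job is given a label tied to the Jenkins BUILD_TAG environment variable, the cleanup step can force-remove whatever is left, whether the build finished, failed or was aborted:
docker run -d --label "jenkins_build=${BUILD_TAG}" my-test-image        # hypothetical image name
# post-build cleanup: remove any container labelled with this build's tag
docker ps -aq --filter "label=jenkins_build=${BUILD_TAG}" | xargs -r docker rm -f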

Related

Execute script inside Docker Swarm node from Jenkins

I have a Django application that is deployed to Docker Swarm. (Replicated, running on several nodes)
The application contains a shell script do_stuff.sh that has to be executed periodically, and which ultimately runs the custom command python manage.py do_django_stuff on the node.
What I want is to schedule and trigger the execution of the script from Jenkins. (As from there it is convenient to track the results/see log output for troubleshooting etc)
The application has a fixed FQDN (e.g. my_app.my_server.com), but I am struggling with how to point Jenkins at the specific container so the script can be executed there.
How can I achieve that?
Thanks!
Found the solution. To do the task above I had to do two things:
Put a label on the Jenkins job that targets the build at the Docker Swarm cluster (it picks up one of the machines).
Execute the script as docker exec $(docker ps -q -f name=name_of_my_application) ./do_stuff.sh
This way the task is executed inside Docker Swarm but has the same output as it would on a regular deployment.
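A slightly more defensive version of that exec step (a sketch only; the service name name_of_my_application comes from the answer above) fails loudly when no matching container is running on the node Jenkins picked:
CONTAINER_ID=$(docker ps -q -f name=name_of_my_application)
if [ -z "$CONTAINER_ID" ]; then
  echo "no running container matching name_of_my_application on this node" >&2
  exit 1
fi
docker exec "$CONTAINER_ID" ./do_stuff.sh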

How to keep Docker containers running

I launch integration tests in Docker.
If a test fails, the container running the tests exits with code 1, which fails the Maven build.
That is exactly what I want, but the fabric8 plugin removes all containers and I cannot investigate what went wrong. I cannot use Docker volumes to put the logs there.
I need the container to keep running.
You can use the autoRemove option in the configuration of the docker:start task:
https://dmp.fabric8.io/#start-configuration
If you set autoRemove=false without specifying a restart policy, the container is kept around even after it exits.
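This mirrors plain Docker behaviour, which may help when reasoning about it: a container started with --rm is deleted on exit, while one started without it stays around and its logs can still be read. A small illustration (the container name test-run and the use of the alpine image are just for the example):
docker run --name test-run alpine sh -c 'echo running tests; exit 1'   # container exits non-zero but is not removed
docker logs test-run    # logs are still available for investigation
docker rm test-run      # clean up manually once you are done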

What has to be done to deliver on Docker and avoid accumulating images?

I use Docker to run a website I develop.
When a release has to be delivered, I have to build a new Docker image and start a new container from it.
The problem is that images and containers accumulate and take up a huge amount of space.
Besides delivering, I need to stop the running container, delete it, and delete the source image too.
I am not asking for Docker command lines but for a checklist or process so that nothing is forgotten.
For instance:
- Stop running container
- Delete stopped container
- Delete old image
- Build new image
- Start new container
Am I missing something?
I'm not very familiar with Docker; maybe there are best practices for this fairly classic use case?
The local workflow that works for me is:
Do core development locally, without Docker. Things like interactive debuggers and live reloading work just fine in a non-Docker environment without weird hacks or root access, and installing the tools I need usually involves a single brew or apt-get step. Make all of my pytest/junit/rspec/jest/... tests pass.
docker build a new image.
docker stop && docker rm the old container.
docker run a new container.
When the number of old images starts to bother me, docker system prune.
If you're using Docker Compose, you might be able to replace the middle set of steps with docker-compose up --build.
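Put together, the local cycle above might look like this as a small script (the image name mysite, the container name mysite-app, and the port mapping are made-up placeholders):
docker build -t mysite:latest .
docker stop mysite-app || true      # ignore errors if no old container is running
docker rm mysite-app || true
docker run -d --name mysite-app -p 8080:80 mysite:latest
docker system prune -f              # run occasionally, when old images start piling up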
In a production environment, the sequence is slightly different:
When your CI system sees a new commit, after running the repository's local tests, it docker build && docker push a new image. The image has a unique tag, which could be a timestamp or source control commit ID or version tag.
Your deployment system (could be the CI system or a separate CD system) tells whatever cluster manager you're using (Kubernetes, a Compose file with Docker Swarm, Nomad, an Ansible playbook, ...) about the new version tag. The deployment system takes care of stopping, starting, and removing containers.
If your cluster manager doesn't handle this already, run a cron job to docker system prune.
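For the production flow, the CI build-and-push step might be sketched like this (the registry path registry.example.com/mysite is hypothetical; the tag here is the short source control commit ID):
TAG=$(git rev-parse --short HEAD)
docker build -t registry.example.com/mysite:"$TAG" .
docker push registry.example.com/mysite:"$TAG"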
You should use:
docker system df
to investigate the space used by Docker.
After that you can use
docker system prune -a --volumes
to remove unused components. You have to stop containers yourself before doing this, but this way you are sure to cover everything.

Docker pipeline's "inside" not working in Jenkins slave running within Docker container

I'm having issues getting a Jenkins pipeline script to work that uses the Docker Pipeline plugin to run parts of the build within a Docker container. Both Jenkins server and slave run within Docker containers themselves.
Setup
Jenkins server running in a Docker container
Jenkins slave based on custom image (https://github.com/simulogics/protokube-jenkins-slave) running in a Docker container as well
Docker daemon container based on docker:1.12-dind image
Slave started like so: docker run --link=docker-daemon:docker --link=jenkins:master -d --name protokube-jenkins-slave -e EXTRA_PARAMS="-username xxx -password xxx -labels docker" simulogics/protokube-jenkins-slave
Basic Docker operations (pull, build and push images) are working just fine with this setup.
(Non-)Goals
I want the server to not have to know about Docker at all. This should be a characteristic of the slave/node.
I do not need dynamic allocation of slaves or ephemeral slaves. One slave started manually is quite enough for my purposes.
Ideally, I want to move away from my custom Docker image for the slave and instead use the inside function provided by the Docker pipeline plugin within a generic Docker slave.
Problem
This is a representative build step that's causing the issue:
image.inside {
    stage('Install Ruby Dependencies') {
        sh "bundle install"
    }
}
This would cause an error like this in the log:
sh: 1: cannot create /workspace/repo_branch-K5EM5XEVEIPSV2SZZUR337V7FG4BZXHD4VORYFYISRWIO3N6U67Q#tmp/durable-98bb4c3d/pid: Directory nonexistent
Previously, this warning would show:
71f4de289962-5790bfcc seems to be running inside container 71f4de28996233340c2aed4212248f1e73281f1cd7282a54a36ceeac8c65ec0a
but /workspace/repo_branch-K5EM5XEVEIPSV2SZZUR337V7FG4BZXHD4VORYFYISRWIO3N6U67Q could not be found among []
Interestingly enough, exactly this problem is described in CloudBees documentation for the plugin here https://go.cloudbees.com/docs/cloudbees-documentation/cje-user-guide/index.html#docker-workflow-sect-inside:
For inside to work, the Docker server and the Jenkins agent must use the same filesystem, so that the workspace can be mounted. The easiest way to ensure this is for the Docker server to be running on localhost (the same computer as the agent). Currently neither the Jenkins plugin nor the Docker CLI will automatically detect the case that the server is running remotely; a typical symptom would be errors from nested sh commands such as
cannot create /…#tmp/durable-…/pid: Directory nonexistent
or negative exit codes.
When Jenkins can detect that the agent is itself running inside a Docker container, it will automatically pass the --volumes-from argument to the inside container, ensuring that it can share a workspace with the agent.
Unfortunately, the detection described in the last paragraph doesn't seem to work.
Question
Since both my server and slave are running in Docker containers, what kind of volume mapping do I have to use to make this work?
I've seen variations of this issue, also with the agents powered by the kubernetes-plugin.
I think that for it to work, the agent/jnlp container needs to share the workspace with the build container.
By build container I am referring to the one that will run the bundle install command.
This could possibly work via withArgs.
The question is why would you want to do that? Most of the pipeline steps are being executed on master anyway and the actual build will run in the build container. What is the purpose of also using an agent?
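For what it's worth, the --volumes-from mechanism the CloudBees documentation refers to can be illustrated with plain Docker commands (the container name jenkins-agent and the image my-agent-image are hypothetical): the build container reuses the agent container's volumes, so both see the same workspace path.
docker run -d --name jenkins-agent -v /workspace my-agent-image      # agent container owning the workspace volume
docker run --rm --volumes-from jenkins-agent ruby:2.7 ls /workspace  # build container sees the same /workspace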

Test Docker cluster in Jenkins

I am having some difficulties configuring Jenkins to run tests against a dockerized application.
First, here is my setup: the project is on Bitbucket, and I have a docker-compose file that runs my application, which is composed of three containers for now (one for mongo, one for redis, one for my node app).
The Bitbucket webhook works well and Jenkins is triggered when I push.
However, what I would like a build to do is:
check out the repo where my docker-compose file is, run docker-compose so that my cluster is up, then run "npm test" inside the repo (my tests use mocha), and finally have Jenkins notified whether the tests passed or not.
If someone could help me get this chain of operations set up in Jenkins, it would be awesome.
The simplest way is to use the Jenkins Pipeline plugin or a shell script.
To build the Docker image and bring up the compose setup you can use the docker-compose command. The important thing is that you need to rebuild the Docker image at the compose level (if you only run docker-compose up, Jenkins may reuse a previously built image), so run docker-compose build first.
Your Dockerfile should copy all the files of your application.
Then, when your services are ready, you can run a command inside a container using docker exec {CONTAINER_ID} {COMMAND_TO_RUN_TESTS}.
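Strung together as a Jenkins shell build step, that chain might look roughly like this (a sketch only; the compose service name app is hypothetical and depends on your docker-compose.yml):
set -e
docker-compose build                 # rebuild the image so the latest code is used
docker-compose up -d                 # start mongo, redis and the node app
set +e
docker exec "$(docker-compose ps -q app)" npm test   # run the mocha tests inside the app container
STATUS=$?
set -e
docker-compose down                  # tear the cluster down regardless of the result
exit $STATUS                         # a non-zero exit fails the Jenkins build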
