Self-hosted GitHub Actions machine running out of disk space in a few hours - docker

We put together a very powerful machine at my company for CI with GitHub Actions. Because the machine is so powerful, we run ~15 GitHub Actions runners directly on the Linux host. We also run all of our jobs using the container syntax, i.e.
build-js-servers:
  name: Build JS Servers
  runs-on: [ self-hosted, docker ]
  container:
    image: node:16
We're now running into an issue where we create so many dangling containers and images that our device runs out of space in a few hours. We have a nightly build that runs docker system prune --all -f on the machine, but it appears not to be enough.
All the containers and images that GitHub Actions creates are pointless anyway: they waste space and unnecessarily wear out our SSD. Is there a way we can configure the temporary containers not to be saved to disk at all? That would be an ideal solution.

You can use a cronjob to run docker system prune --all -f every day at midnight.
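For example, a minimal crontab entry for that nightly prune might look like this (the docker binary path and log location are assumptions for illustration):

# m h dom mon dow  command
0 0 * * * /usr/bin/docker system prune --all --force >> /var/log/docker-prune.log 2>&1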
Another option is to use a post-job script to clean up the workspace after job execution; this feature is still in beta. Alternatively, consider adding a step at the very end of your workflow that runs rm -rf to remove the files from the GitHub workspace.
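As a rough sketch of the hook approach, the runner reads hook scripts from environment variables such as ACTIONS_RUNNER_HOOK_JOB_COMPLETED in its .env file; the script path and its contents below are just placeholders:

#!/bin/bash
# /opt/actions-runner/cleanup.sh, referenced from the runner's .env as
#   ACTIONS_RUNNER_HOOK_JOB_COMPLETED=/opt/actions-runner/cleanup.sh
set -euo pipefail
docker container prune -f   # remove stopped containers left behind by the job
docker image prune -f       # remove dangling images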
Also, consider mounting a secondary volume backed by an elastic NFS service such as Amazon EFS or Google Cloud Filestore; you can then point the Docker daemon at the mounted device by changing its data-root configuration parameter.
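For instance, assuming the elastic volume is mounted at /mnt/docker-data, the daemon could be repointed roughly like this (note that existing images and containers are not visible under the new data root unless you migrate them):

sudo systemctl stop docker
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "data-root": "/mnt/docker-data"
}
EOF
sudo systemctl start docker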

Related

What has to be done to deliver on Docker and avoid accumulating images?

I use Docker to run a website I build.
When a release has to be delivered, I have to build a new Docker image and start a new container from it.
The problem is that images and containers accumulate and take up a huge amount of space.
Besides the delivery itself, I need to stop the running container, delete it, and delete the source image too.
I don't need Docker command lines, just a checklist or process so that I don't forget anything.
For instance:
- Stop running container
- Delete stopped container
- Delete old image
- Build new image
- Start new container
Am I missing something?
I'm not very experienced with Docker; maybe there are best practices for this fairly classic use case?
The local workflow that works for me is:
Do core development locally, without Docker. Things like interactive debuggers and live reloading work just fine in a non-Docker environment without weird hacks or root access, and installing the tools I need usually involves a single brew or apt-get step. Make all of my pytest/junit/rspec/jest/... tests pass.
docker build a new image.
docker stop && docker rm the old container.
docker run a new container.
When the number of old images starts to bother me, docker system prune.
If you're using Docker Compose, you might be able to replace the middle set of steps with docker-compose up --build.
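As a sketch of that local loop (the image and container names here are made up):

# Rebuild the image and swap the running container for a new one.
docker build -t myapp:latest .
docker stop myapp && docker rm myapp
docker run -d --name myapp -p 8080:8080 myapp:latest

# Or, with Docker Compose, the middle steps collapse into:
docker-compose up --build -d

# Occasionally reclaim space from old images:
docker system prune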
In a production environment, the sequence is slightly different:
When your CI system sees a new commit, after running the repository's local tests, it docker build && docker push a new image. The image has a unique tag, which could be a timestamp or source control commit ID or version tag.
Your deployment system (could be the CI system or a separate CD system) tells whatever cluster manager you're using (Kubernetes, a Compose file with Docker Swarm, Nomad, an Ansible playbook, ...) about the new version tag. The deployment system takes care of stopping, starting, and removing containers.
If your cluster manager doesn't handle this already, run a cron job to docker system prune.
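A minimal sketch of the CI side of that flow, using the commit ID as the unique tag (registry and image names are placeholders):

# Build and push an immutable, uniquely tagged image from CI.
TAG=$(git rev-parse --short HEAD)
docker build -t registry.example.com/myapp:"$TAG" .
docker push registry.example.com/myapp:"$TAG"
# The deployment system then tells the cluster manager to roll out myapp:$TAG.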
You should use:
docker system df
to investigate the space used by docker.
After that you can use
docker system prune -a --volumes
to remove unused components. You have to stop containers yourself before doing this, but this way you are sure to cover everything.

Rapidly modifying Python app in K8S pod for debugging

Background
I have a large Python service that runs on a desktop PC, and I need to have it run as part of a K8S deployment. I expect that I will have to make several small changes before the service runs correctly in a deployment/pod.
Problem
So far, if I encounter an issue in the Python code, it takes a while to update the code, and get it deployed for another round of testing. For example, I have to:
Modify my Python code.
Rebuild the Docker container (which includes my Python service).
scp the Docker container over to the Docker Registry server.
docker load the image, update tags, and push it to the Registry back-end DB.
Manually kill off currently-running pods so the deployment restarts all pods with the new Docker image.
This involves a lot of lead time each time I need to debug a minor issue. Ideally, I'd prefer being able to just modify the copy of my Python code already running on a pod, but I can't kill it (since the Python service is the default app that is launched, with PID=1), and K8S doesn't support restarting a pod (to my knowledge). Alternatively, if I kill/start another pod, it won't have my local changes from the pod I was previously working on (which is by design, of course, but doesn't help with my debugging efforts).
Question
Is there a better/faster way to rapidly deploy (experimental/debug) changes to the container I'm testing, without having to spend several minutes recreating container images, re-deploying/tagging/pushing them, etc? If I could find and mount (read-write) the Docker image, that might help, as I could edit the data within it directly (i.e. new Python changes), and just kill pods so the deployment re-creates them.
There are two main options: one is to use a tool that reduces or automates that flow, the other is to develop locally with something like Minikube.
For the first, there are a million and a half tools but Skaffold is probably the most common one.
For the second, you do something like ( eval $(minikube docker-env) && docker build -t myimagename . ) which will build the image directly in the Minikube docker environment so you skip steps 3 and 4 in your list entirely. You can combine this with a tool which detects the image change and either restarts your pods or updates the deployment (which restarts the pods).
Also, FWIW, using scp and docker load is very non-standard; generally those steps would be combined into a single docker push.
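Roughly, the shortened loop could look like this (the deployment and image names are made up, and it assumes the Deployment's imagePullPolicy lets the node reuse a locally built image):

# Build straight into Minikube's Docker daemon, skipping scp/docker load entirely.
( eval $(minikube docker-env) && docker build -t myimagename . )

# Bounce the pods so the Deployment picks up the rebuilt image.
kubectl rollout restart deployment/my-python-service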
I think your pain point is that the container depends on the Python code baked into the image. You can find a way to exclude the source code from the image build phase.
In my experience, I create a Docker image that only includes the Python package dependencies, and use a volume to map the source code directory to a path in the container, so you don't need to rebuild the image unless dependencies are added or removed.
Example
I don't have much experience with k8s, but I believe it must be more or less the same as docker run.
Dockerfile
FROM python:3.7-stretch
COPY ./python/requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
ENTRYPOINT ["bash"]
Docker container
Deploy your code to the server with scp, and map your host source path to the container source path like this:
docker run -it -d -v /path/to/your/python/source:/path/to/your/server/source --name python-service your-image-name
With volume mapping, your container no longer depends on source code baked into the image, so you can easily change your source code without rebuilding the image.
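With that setup, an edit-and-test cycle could be as simple as the following sketch (same placeholder paths and names as above, and it assumes the container's entrypoint actually starts your service so a restart reloads the new code):

# Copy the changed sources to the server; no image rebuild needed,
# because the code enters the container through the volume.
scp -r ./python/ user@server:/path/to/your/python/source

# Restart the container so the service picks up the new code.
docker restart python-service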

Is git pull, docker-compose build and docker-compose up -d a good way to deploy a complete solution on an empty machine?

Recently, we finished a web application solution using Docker.
https://github.com/yccheok/celery-hello-world/tree/nginx (The actual solution is hosted in a private repository. This example just gives a quick glance at how our project structure looks.)
We plan to purchase 1 empty Linux machine and deploy on it. We might purchase more machines in the future, but with the current traffic, 1 machine will be sufficient.
My plan for deployment on the single empty machine is
git pull <from private code repository>
docker-compose build
docker-compose up -d
Since we are going to deploy to multiple machines in the near future, I was wondering: is this common practice for deploying a Docker application onto a fresh, empty machine?
Is there anything we can utilize from https://hub.docker.com/ that would let us avoid performing git pull during the deployment stage?
You don't want to perform git pull on each machine - your intuition is correct.
Instead, you want to use a remote Docker registry (such as Docker Hub).
So the right flow, each time your source code (git repo) is changed:
git pull from all relevant repos.
docker-compose build to build all relevant images.
docker-compose push to push all images (diff) to remote registry.
docker-compose pull in your production machines, to get the latest updated images.
docker-compose up to start all containers.
The first 3 steps should be done on your CI machine (for example, as a Jenkins job), and steps 4-5 on your production machines.
EDIT: one thing to consider. I think building via docker-compose is a bad idea. Consider building directly with docker build -f Dockerfile -t repo/image:tag . and, in docker-compose, just specifying the image name.
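Put together, a rough sketch of the two sides (the registry/image names and tag are placeholders):

# CI machine: build, tag, and push
docker build -f Dockerfile -t repo/image:1.2.3 .
docker push repo/image:1.2.3

# Production machine: pull the prebuilt image and restart the stack
# (docker-compose.yml references image: repo/image:1.2.3 instead of a build: section)
docker-compose pull
docker-compose up -d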
My opinion is you should not BUILD images on production machines, because the image might be different than you would expect, and you should limit what you do on production machines. With that being said, I would recommend:
- updating the code on your local computer (development)
- when you push code to git, using some software to build your images from that push, for example GitLab CI (a continuous integration tool)
- GitLab CI will build the image, can then run some tests on it, and then deploy it (this built image) to production
- on your production machine, just running docker-compose pull && docker-compose up -d, and that is it.
I strongly recommend building images on a machine other than the production machines, and using a CI tool to test your images before deploying, for example https://docs.gitlab.com/ce/ci/README.html
Deploying it on a fresh machine, or the other way around, would be fine.
The best way to go about it is to make a private repo on https://hub.docker.com/ and push your images there.
Building and shipping the image
git pull
docker build
docker login
docker push repo/image
Pulling the shipped image and deploying
docker login on the server
docker pull repo/image
docker-compose up -d
Though I would recommend you look at container scheduling using Kubernetes and setting up your CI/CD stack with Jenkins to automate this process; in case something bad happens, it can be a life saver.

How are Packer and Docker different? Which one should I prefer when provisioning images?

How are Packer and Docker different? Which one is easier/quicker to provision/maintain, and why? What are the pros and cons of having a Dockerfile?
Docker is a system for building, distributing and running OCI images as containers. Containers can be run on Linux and Windows.
Packer is an automated build system to manage the creation of images for containers and virtual machines. It outputs an image that you can then take and run on the platform you require.
For v1.8 this includes - Alicloud ECS, Amazon EC2, Azure, CloudStack, DigitalOcean, Docker, Google Cloud, Hetzner, Hyper-V, Libvirt, LXC, LXD, 1&1, OpenStack, Oracle OCI, Parallels, ProfitBricks, Proxmox, QEMU, Scaleway, Triton, Vagrant, VirtualBox, VMware, Vultr
Docker's Dockerfile
Docker uses a Dockerfile to manage builds which has a specific set of instructions and rules about how you build a container.
Images are built in layers. Each FROM, RUN, ADD, and COPY command modifies the layers included in an OCI image. These layers can be cached, which helps speed up builds. Each layer can also be addressed individually, which helps with disk usage and download usage when multiple images share layers.
Dockerfiles have a bit of a learning curve; it's best to look at some of the official Docker images for practices to follow.
Packer's Docker builder
Packer does not require a Dockerfile to build a container image. The Docker plugin has an HCL or JSON config file which starts the image build from a specified base image (like FROM).
Packer then allows you to run standard system config tools called "Provisioners" on top of that image. Tools like Ansible, Chef, Salt, shell scripts etc.
This image will then be exported as a single layer, so you lose the layer caching/addressing benefits compared to a Dockerfile build.
Packer allows some modifications to the build container environment, like running as --privileged or mounting a volume at build time, that Docker builds will not allow.
Times you might want to use Packer are when you want to build images for multiple platforms using the same setup. It also makes it easy to reuse existing build scripts if there is a provisioner for them.
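As an illustration of that workflow (not taken from the answer above), a minimal Packer Docker-builder template could be written and built from a shell session roughly like this; the base image, provisioner step, and repository name are all made up:

# Write a minimal HCL template and build it with Packer.
cat > node.pkr.hcl <<'EOF'
packer {
  required_plugins {
    docker = {
      version = ">= 1.0.0"
      source  = "github.com/hashicorp/docker"
    }
  }
}

source "docker" "node" {
  image  = "node:16"
  commit = true
}

build {
  sources = ["source.docker.node"]

  provisioner "shell" {
    inline = ["node --version", "npm --version"]
  }

  post-processor "docker-tag" {
    repository = "myrepo/node-base"
    tags       = ["latest"]
  }
}
EOF

packer init .            # install the Docker plugin
packer build node.pkr.hcl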
Expanding on the "Which one is easier/quickest to provision/maintain and why? What are the pros and cons of having a Dockerfile?" part:
From personal experience learning and using both, I found: (YMMV)
docker configuration was easier to learn than packer
docker configuration was harder to coerce into doing what I wanted than packer
speed difference in creating the image was negligible, after development
docker was faster during development, because of the caching
the docker daemon consumed some system resources even when not using docker
there are a handful of processes running as the daemon
I did my development on Windows, though I was targeting Linux servers for running the images.
That isn't an issue during development, except for a foible of running Docker on Windows.
The docker daemon reserves various TCP port ranges for itself
The ranges might change every time you reboot your system or restart the daemon
The only error message is to the effect of "can't use that port!", but not why it can't.
BTW, the workaround is to:
turn off the hypervisor
reboot
reserve the public ports you want your host system to see
turn on the hypervisor
reboot
Running Packer on Windows, however, the issue I found is that the provisioner I wanted to use, Ansible, doesn't run on Windows.
Sigh.
So I ended up having to run Packer on a Linux system after all.
Just because I was feeling perverse, I wrote a Dockerfile so I could run both packer and ansible from my Windows station in a docker container using that image.
Docker builds images using a Dockerfile.
These can then be run as Docker containers.
Packer also builds images, but you don't need a Dockerfile, and you get the option of using provisioners such as Ansible, which let you create vastly more customisable images. It isn't used for running those images.

Cleaning up orphaned docker containers after Jenkins job is terminated

I work at a large organization that runs hundreds of jobs in a shared Jenkins cluster.
My Jenkins job needs to run integration tests against untrusted code running inside Docker containers. I am fearful that when my Jenkins job gets terminated abruptly (e.g. the job is aborted or times out) I will be left with orphaned containers.
I have tried https://github.com/moby/moby/issues/1905 and ulimits does not work for me (this is because it only works for containers that run bash, and I cannot guarantee that mine will do so).
I tried https://stackoverflow.com/a/26351355/14731 but --lxc-conf is not a recognized option for Docker for Windows (this needs to run across all platforms supported by docker).
Any ideas?
Well, you can have a cleanup command in the first and last step of your job. For example, first clean up old dead containers, then rename the existing container to old_$jobname and kill it:
docker container prune -f
docker rename $jobname old_$jobname
docker kill old_$jobname
Then do whatever you need and launch your new container:
docker run --name $jobname
By the looks of things, people are handling this outside of docker.
They are adding Jenkins post-build steps that clean up orphaned docker containers on aborted or failed builds.
See Martin Kenneth's build script as an example.
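One common shape for such a post-build step is to label every container the job starts and then force-remove anything carrying that label, regardless of how the job ended; the label key below is made up, while JOB_NAME and BUILD_NUMBER are standard Jenkins environment variables:

# During the job: start test containers with a job-specific label.
docker run -d --label ci_job="${JOB_NAME}-${BUILD_NUMBER}" my-test-image

# Post-build step (runs even on abort/failure): remove anything labelled for this job.
docker ps -aq --filter "label=ci_job=${JOB_NAME}-${BUILD_NUMBER}" | xargs -r docker rm -f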
