I'm using my laptop as a single-node Docker Swarm cluster.
After deploying my stack, running a docker build command becomes extremely slow. Even cached steps (e.g. RUN chmod ...) sometimes take minutes to complete.
How can I debug this and understand the cause of the slowdown?
Context
Number of services in my swarm cluster: 22
Docker version: 18.04-ce
Host OS: Linux 4.15.15
Host Arch: x86_64
Host specs: i7, 16GB of RAM, SSD/HDD hybrid disk (docker images are stored in the HDD part)
Using VMs or docker-machine: No
In this case, it turned out to be too much disk I/O.
As I've mentioned above, my laptop's storage is split between an SSD and an HDD, and the docker images are stored on the HDD. What I initially overlooked is that the docker volumes it creates live there as well.
The cluster that I am running locally contains a PostgreSQL database that receives a lot of writes. Those writes were saturating my HDD, so the solution to this specific problem was to mount PostgreSQL's storage in a directory stored on the SSD. Find below the debug procedure.
I found this out by using iostat, as instructed in this blog post:
iostat -x 2 5
By looking at the output of this command, it became clear that my HDD's %util column was up at 99%, so it was probably the culprit. Next, I ran iotop, and dockerd + postgres were at the top of the list.
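To spot the saturated device programmatically, the %util column of iostat -x can be filtered. A minimal sketch using awk on captured sample output (the device names and numbers below are invented for illustration):

```shell
# Sample of `iostat -x` output saved to a file; values are made up.
cat <<'EOF' > iostat_sample.txt
Device  r/s    w/s     %util
sda     1.20   3.40    12.00
sdb     0.40   180.90  99.10
EOF

# Print any device whose %util exceeds 90% -- a saturated disk stands out.
awk 'NR > 1 && $4 > 90 { print $1, $4 }' iostat_sample.txt
# -> sdb 99.10
```

Running this against live output (iostat -x 2 5) makes the clogged HDD obvious without reading the whole table.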
In conclusion, if your containers are very I/O intensive, they could slow down the whole docker infrastructure to a crawl.
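For reference, the fix itself (pointing PostgreSQL's data directory at a path on the SSD) can be sketched as a bind mount in a compose/stack file. The /mnt/ssd/pgdata path and the postgres:15 tag are placeholders for your own setup:

```yaml
services:
  db:
    image: postgres:15
    volumes:
      # Bind-mount a directory that lives on the SSD instead of letting
      # Docker place an anonymous volume on the (slow) HDD.
      - /mnt/ssd/pgdata:/var/lib/postgresql/data
```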
Related
I have a fairly large Windows docker image + container (it has MSVS and lots of tools, based on Windows server 2022). It runs quite slowly even on my fast 16-core Threadripper Windows 11 desktop; it seems hugely disk-bound as well as taking over 50GB of disk space (and it eats more, the longer I use it). The host machine has WSL2 and Docker Desktop (with the WSL2 back-end enabled), and Hyper-V is enabled. The container is self-contained; it doesn't bind-mount any volumes from the host.
Looking at Task Manager, the C disk is pinned at 100% active time with very slow response rates; that's never good. Using procmon I see most of the disk accesses are from "vmmem" and "docker-index", and my c:\ProgramData\Docker\windowsfilter dir fills up pretty often. And I never get more than 1 or 2 CPUs worth of compute, even though I've allocated 8 CPUs to the container (probably just because it's so disk-bound).
I've read various things about how to speed up docker containers on Windows, but since I'm not 100% clear on the underlying architecture (is dockerd running in a VM? What about docker-index? The container itself? What's the filesystem driver in the container?) I'm not sure how to go about speeding it up.
Should I remove Docker Desktop and move to "plain" Windows docker as in https://lippertmarkus.com/2021/09/04/containers-without-docker-desktop/? I don't care about the desktop GUI; I only use the CLI anyway (docker build, docker-compose up, etc.).
Should I run Docker from within WSL? Would that even work with a Windows image/container?
Should I get a clean machine and only run the docker image on it?
Any other ideas?
The fastest way is:
Install a Linux distro;
Boot into the Linux OS;
Install Docker (https://docs.docker.com/engine/install/ubuntu/);
Bring your container up with docker build or docker-compose up.
I just need some kind of reference on how long it should take to create a container based on a 4GB docker image. On my computer it is currently taking >60 seconds, which causes docker-compose to timeout. Is this normal for a modern workstation with SSD disks and a decent CPU? Why does it take so long?
The docker context is ~6MB, so that should not be the issue here, though I know it could be if the context were larger.
It's running on a Linux host, so it's also not the I/O-overhead tax you pay when running docker in a VM, as Docker for Mac does.
I just don't understand why it's so slow, if it's expected for these large images, or if I should try some other technology instead of Docker (like a virtual machine, ansible scripts or whatever).
We recently had our Jenkins setup redone. We decided to run the new version in a Docker container on the server.
While migrating, I noticed that Jenkins is MUCH slower when it's in a container than when it ran on the server itself.
This is a major issue and could jeopardize our migration.
I tried looking for ways to give more resources to the container, without much success.
How can I speed up the Jenkins container / give it all the resources it needs on the server (the server is dedicated only to Jenkins)?
Also, how do I divide these resources when I want to start up slave containers as well?
Disk operations
One thing that can be slow with Docker is a containerized process making a lot of I/O calls to the container file system. The container file system is a union file system, which is not optimized for speed.
This is where docker volumes are useful. In addition to providing a location on the file system that survives container deletion, a docker volume offers good disk performance.
The Jenkins Docker image defines the JENKINS_HOME location as a docker volume, so as long as your Jenkins jobs perform their disk operations within that location, you should be fine.
If you determine that disk access on that volume is still too slow, you could customize the mount location of that volume on your docker host so that it ends up on a fast drive such as an SSD.
Another trick is to make a docker volume backed by RAM with tmpfs. Note that such a volume offers no persistence: data at that location is lost when the container is stopped or deleted.
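A tmpfs-backed mount can be sketched in a compose file like this (the workspace path and the 512 MB size are illustrative guesses, not recommendations):

```yaml
services:
  jenkins:
    image: jenkins/jenkins:lts
    volumes:
      - type: tmpfs
        target: /var/jenkins_home/workspace
        tmpfs:
          # Keep the RAM-backed area bounded; its contents vanish
          # when the container stops.
          size: 536870912   # 512 MB, in bytes
```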
JVM memory exhaustion / Garbage collector
As Jenkins is a Java application, another potential issue comes to mind: memory exhaustion. If the JVM that the Jenkins process runs on is given too little memory, the Java garbage collector will run too frequently. You can spot this when your Java app is using a suspicious amount of CPU (the garbage collector costs CPU). If that is the case, give more memory to the JVM:
docker run -p 8080:8080 -p 50000:50000 --env JAVA_OPTS="-Xmx2048m -Djava.awt.headless=true" jenkins/jenkins:lts
Network
Docker containers have a virtual network stack and custom network settings. You also want to make sure that all network-related operations are fast.
The DNS server might be an issue, check it by executing ping <some domain name> from the Jenkins container.
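A sketch of that check: run the same probe from inside the container with docker exec (the container name and domain below are examples):

```shell
# From the host, against a container named "jenkins":
#   docker exec jenkins getent hosts updates.jenkins.io
# If that hangs for seconds before answering, name resolution is the bottleneck.

# The same probe run locally, against a name that needs no external resolver:
getent hosts localhost
```

getent uses the same resolver path (nsswitch) as most Linux programs, so it reflects what the application actually experiences, unlike a bare dig against one nameserver.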
Does a docker container get the same bandwidth as the host? Or do we need to configure minimum and/or maximum values? I've noticed that we need to override the default RAM (which is 2 GB) and swap space configuration if we need to run CPU-intensive jobs.
Also, do we need to configure the disk space? Or does it by default get as much space as the actual hard disk?
Memory and CPU are controlled by docker using cgroups. If you do not configure them, containers are unrestricted and can use all of the memory and CPU on the docker host. If you run in a VM, which includes all Docker Desktop installs, then you will be limited to that VM's resources.
Disk space is usually limited to the disk space available in /var/lib/docker. For that reason, many make this a separate mount. If you use devicemapper as docker's graph driver (now largely deprecated), disk space is preallocated in blocks, and you can control that block size. You can restrict containers by running them with read-only root filesystems and mounting volumes into the container that have limited disk space. I've seen this done with loopback device mounts, but it requires some configuration outside of docker to set up the loopback device. With a VM, you will again be limited by the disk space allocated to that VM.
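As a sketch, those cgroup caps and the read-only-root pattern map onto compose options like these (the values are arbitrary examples, and appdata is a hypothetical volume name):

```yaml
services:
  app:
    image: nginx:alpine
    # Cap the container instead of letting it take the whole host.
    mem_limit: 512m
    cpus: 1.5
    # Read-only root filesystem; writes are confined to the named volume.
    read_only: true
    volumes:
      - appdata:/data
volumes:
  appdata:
```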
Network bandwidth is by default unlimited. I have seen an interesting project called docker-tc which monitors containers for their labels and updates bandwidth settings for a container using tc (traffic control).
Does a docker container get the same bandwidth as the host?
Yes. There is no limit imposed on network utilization. You could maybe impose limits using a bridge network.
Also, do we need to configure the disk space? Or does it by default get as much space as the actual hard disk?
It depends on which storage driver you're using, because each has its own options. For example, devicemapper uses a 10G base size per container by default but can be configured to use more. The recommended driver now is overlay2; there, a per-container size can be set by starting docker with the overlay2.size storage option (this requires an xfs backing filesystem mounted with the pquota option).
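If you do use that option, it goes in the daemon configuration. A sketch of /etc/docker/daemon.json, assuming /var/lib/docker sits on xfs mounted with pquota (the 20G figure is just an example):

```json
{
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.size=20G"
  ]
}
```

The daemon must be restarted for this to take effect, and it caps each container's writable layer, not volumes.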
This depends some on what your host system is and how old it is.
In all cases network bandwidth isn't explicitly limited or allocated between the host and containers; a container can do as much network I/O as it wants up to the host's limitations.
On current native Linux there isn't a desktop application and docker info will say something like Storage driver: overlay2 (overlay and aufs are good here too). There are no special limitations on memory, CPU, or disk usage; in all cases a container can use up to the full physical host resources, unless limited with a docker run option.
On older native Linux there isn't a desktop application and docker info says Storage driver: devicemapper. (Consider upgrading your host!) All containers and images are stored in a separate filesystem and the size of that is limited (it is included in the docker info output); named volumes and host bind mounts live outside this space. Again, memory and CPU are not intrinsically limited.
Docker Toolbox and Docker for Mac both use virtual machines to provide a Linux kernel to non-Linux hosts. If you see a "memory" slider you are probably using a solution like this. Disk use for containers, images, and named volumes is limited to the VM capacity, along with memory and CPU. Host bind mounts generally get passed through to the host system.
I know that containers are a form of isolation between the app and the host (the managed running process). I also know that container images are basically the package for the runtime environment (hopefully I got that correct). What's confusing to me is when they say that a Docker image doesn't retain state. So if I create a Docker image with a database (like PostgreSQL), wouldn't all the data get wiped out when I stop the container and restart? Why would I use a database in a Docker container?
It's also difficult for me to grasp LXC. On another question page I see:
LinuX Containers (LXC) is an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a single control host (LXC host)
What does that exactly mean? Does it mean I can have multiple versions of Linux running on the same host as long as the host support LXC? What else is there to it?
LXC and Docker are completely different, but both are container technologies.
There are two types of containers:
1. Application containers: their main purpose is to package an application together with its dependencies. These are Docker containers (lightweight containers). They run as processes on your host and do everything you need without any OS image or boot-up step; they come and go in a matter of seconds. You are not meant to run multiple processes/services inside a docker container. You can if you want to, but it is laborious. Here, resources (CPU, disk, memory) are shared with the host.
2. System containers: these are fat containers, meaning they are heavy and need OS images to launch. At the same time, they are not as heavy as virtual machines; they are very similar to VMs but differ a bit in architecture.
For example, with Ubuntu as the host machine, if you have LXC installed and configured on that host, you can run a CentOS container, an Ubuntu container (with a different version), RHEL, Fedora or any other Linux flavour on top of the Ubuntu host. You can also run multiple processes inside an LXC container. Here, too, resources are shared.
So, if a huge application running in one LXC container requires more resources while an application in another LXC container requires fewer, the container with the smaller requirement will yield resources to the one with the larger requirement.
Answering Your Question:
So if I create a Docker image with a database (like PostgreSQL), wouldn't all the data get wiped out when I stop the container and restart?
You won't create a database docker image with data baked into it (this is not recommended).
You run/create a container from an image and attach/mount data to it.
So, when you stop/restart a container, the data is never lost as long as you attach it to a volume, because that volume resides somewhere outside the docker container (perhaps an NFS server, or the host itself).
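A minimal sketch of that pattern with a named volume (the image tag and password are placeholders):

```yaml
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      # Data lives in the named volume, not in the container's writable
      # layer, so it survives stop/restart and even container removal.
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```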
Does it mean I can have multiple versions of Linux running on the same host as long as the host support LXC? What else is there to it?
Yes, you can do this. We run LXC containers in our production environment.