Poor file transfer performance from container, recent change

Poor file transfer performance from container, recent change - docker

I run numerous containers on a single docker instance, a few of which have the need to move files around between local and remove file systems that are mounted on the host as CIFS devices, then presented to the container as volumes. In the past I haven’t had an issue when transferring data between these volumes, however recently I’ve seen the performance drop and transfer rates only running at less than 10 MByte/s when they used to run at around 60 MByte/s.
What are some things I can check in the configuration of either the docker engine on my server, or the container configuration?
I using the docker-ce package on Centos 7 for these containers.

Related

ng command very slow on docker volume (Windows)

I use Docker Desktop on Windows 10 (WSL) and need to use Angular on a Docker Volume (with the -v option). Everything works correctly, but the "ng" command seems very slow when it's run from the volume.
I first noticed this running ng serve: the command hangs for more than 1 minute with no log (even in verbose mode) before beginning the compilation. But even ng --version hangs for 15 seconds when it's run from any directory in the volume (the version is 8.1.2) - without any error message (and no docker log). If I run ng --version from any other folder in the container (not in the volume), the version is displayed immediately.
Would you know the reason of this delay or any way to understand and solve it?

I suspect that the main issue is due to the fact that ng commands are read/write intensive. That being said, the Visual Studio Code devcontainer doc indicates:
While using this approach to bind mount the local filesystem into a container is convenient, it does have some performance overhead on Windows and macOS. There are some techniques that you can apply to improve disk performance, or you can open a repository in a container using a isolated container volume instead.
Therefore, instead of mounting the current directory, it would be better in that case to clone the repository in an isolated container volume.
To do so, in VS Code, open the command palette by pressing F1 and select Remote-Containers: Clone Repository in Container Volume. This will create a unique volume for your container with your repository inside.
The techniques mentioned in the quote can be found here.

Live upgrade with docker

I'm making a docker image for a daemon that can be upgraded live without restarting. And I'm making it minimal by using a multistage build and start everything with docker-compose.
Since this daemon has most of its features in loadable modules, upgrades are usually just a matter of reloading them. Which is a very nice feature to have, because restarting the daemon would mean disconnecting all the users. But I don't know how to keep this feature with a docker image.
A shared volume obviously come to mind, but this doesn't seem to play well with a multistage build or with docker-compose.

Unfortunatley, I don't think this is possible with docker. As docker images are immutable, you need to create a new image with the new version of unrealircd. From that image you can start a new docker container. Using a shared volume would be possible in theory but that is really not the intended use-case of volumes. Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. If you use them to store the modules of unrealircd you loose the ability to just take your docker image and start another container with the same application in it.

Docker Storage - Getting a Layman's answer

I am just discovering Docker - I am finding so much information, but I can't seem to get a straight answer on this option. If someone could give me a clear explanation based on my understanding I have of it so far it would be appreciated.
I am downloading a docker image locally - say the default one from Microsoft, using microsoft/dotnet-samples:dotnetapp-nanoserver I am lost as to where this is downloaded to? Is this downloaded and installed as a program on the host machine, with a isolated script that controls the container? The download is about 1.3 gigs because it includes .Net Core
In another example, if I download apache2 to run as a web server, does it install it in the default paths on the host system, but every container I want to use taps into that - or does every container contain it's isolated version of apache2?
I ask this because I can't find files that mimic the file size of these programs.
I know they are not complete VM's but where can I find the files associated with a container?
I am using Windows Server 2016 and a Mac since I want to do some trials with containers.

An image is a filesystem
Docker images are encapsulated filesystems. The software and files inside are not being directly installed onto your system.
You can think of a Docker image sort of the way you think of a .zip file. You can download a .zip file from somewhere, and it is a single file. Contained inside it might be one file, or dozens of files, or a nested tree of directories and files. But on your disk, it exists as one file.
A Docker image is similar (conceptually, at least... the details are more complicated).
Image storage
Where images are stored varies by platform. On a Linux system, they are usually under /var/lib/docker. I don't know where they are stored on Windows, but this is a more or less opaque store. Poking around inside will not reveal very much to you anyway.
To see what you have, you should use the docker images command. It will show you the images you have stored locally.
Like I said earlier, each image may consist of multiple layers. By default, that command will only show you the top layer, which is the one you'll care about, to run containers from. Technically, there are other layers, and you can see all of them using docker images -a.
Where is the software installed?
When you download an Apache image, nothing is installed on your system at all. The image file(s) are downloaded and stored. Hiding inside is Apache and everything Apache needs in order to run, but Apache is not installed onto your Windows OS anywhere.
When you want to use Apache, you would run a container. Docker takes the Apache image and, using it as a starting template, creates a running process container, inside of which Apache is running. This is isolated from your operating system. Apache is only running inside of the container.
If you run a second container from the Apache image, you now have two completely separate Apache instances running, each in their own isolated filesystem environment.
Where can I find the files?
If you just want to poke around in the container filesystem, you can start the container in interactive mode, and run a shell instead of whatever it normally runs (like Apache). For instance, if you have an image apache:latest, you can do this:
docker run --rm -it apache:latest bash
This will run an instance of apache:latest, but instead of launching Apache, it will run a bash shell and drop you into it.
The --rm flag is convenient for cases like this. It tells Docker to remove the running container when its process exits. That way for a "just looking at something" container like this one, it cleans up after itself.
The -it is actually two flags. -i is interactive mode, and -t allocates a terminal. This is a common flag to pass when you want to directly interact with the container.
Once inside, you can use the usual commands to look at files and directory listings. Note that many containers are stripped-down, though. You don't always have all of the tools you are used to having. Things like ls in Linux are typically there, but a lot of things will not be.
Simply exit when you're done looking around to exit.
Looking around while the process is running
You can also look at the container while Apache is running. First start it normally.
docker run -d apache:latest
This will return a container ID. You can also get the ID from docker ps. Then you can attach to the container with that ID by executing a shell.
docker exec -it <container_id> bash
Now you're in the container in a shell, but Apache is in there running.

How persistent are docker data-only containers

I'm a bit confused about data-only docker containers. I read it's a bad practice to mount directories directly to the source-os: https://groups.google.com/forum/#!msg/docker-user/EUndR1W5EBo/4hmJau8WyjAJ
And I get how I make data-only containers: http://container42.com/2014/11/18/data-only-container-madness/
And I see somewhat similar question like mine: How to deal with persistent storage (e.g. databases) in docker
But what if I have a lamp-server setup.. and I have everything nice setup with data-containers, not linking them 'directly' to my source-os and make a backup once a while..
Than someone comes by, and restarts my server.. How do I setup my docker (data-only)-containers again, so I don't lose any data?

Actually, even though it was shykes who said it was considered a "hack" in that link you provide, note the date. Several eons worth of Docker years have passed since that post about volumes, and it's no longer considered bad practice to mount volumes on the host. In fact, here is a link to the very same shykes saying that he has "definitely used them at large scale in production for several years with no issues". Mount a host OS directory as a docker volume and don't worry about it. This means that your data persists across docker restarts/deployments/whatever. It's right there on the disk of the host, and doesn't go anywhere when your container goes away.
I've been using docker volumes that mount host OS directories for data storage (database persistent storage, configuration data, et cetera) for as long as I've been using Docker, and it's worked perfectly. Furthermore, it appears shykes no longer considers this to be bad practice.

Docker containers will persist on disk until they are explicitly deleted with docker rm. If your server restarts you may need to restart your service containers, but your data containers will continue to exist and their volumes will be available to other containers.

docker rm alone doesn't remove the actual data (which lives on in /var/lib/docker/vfs/dir)
Only docker rm -v would clear out the data as well.
The only issue is that, after a docker rm, a new docker run would re-create an empty volume in /var/lib/docker/vfs/dir.
In theory, you could with symlink redirect the new volume folders to the old ones, but that supposes you notes which volumes were associated to which data container... before the docker rm.

It's worth noting that the volumes you create with "data-only containers" are essentially still directories on your host OS, just in a different location (/var/lib/docker/...). One benefit is that you get to label your volumes with friendly identifiers and thus you don't have to hardcode your directory paths.
The downside is that administrative work like backing up specific data volumes is a bit of a hassle now since you have to manually inspect metadata to find the directory location. Also, if you accidentally wipe your docker installation or all of your docker containers, you'll lose your data volumes.

Docker container and memory consumption

Assume I am starting a big number of docker containers which are based on the same docker image. It means that each docker container is running the same application. It could be the case that the application is big enough and requires a lot of hard drive memory.
How is docker dealing with it?
Does all docker containers sharing the static part defined in the docker image?
If not does it make sense to copy the application into some directory on the machine which is used to run docker containers and to mount this app directory for each docker container?

Docker shares resources at kernel level. This means application logic is in never replicated when it is ran. If you start notepad 1000 times it is still stored only once on your hard disk, the same counts for docker instances.
If you run 100 instances of the same docker image, all you really do is keep the state of the same piece of software in your RAM in 100 different separated timelines. The hosts processor(s) shift the in-memory state of each of these container instances against the software controlling it, so you DO consume 100 times the RAM memory required for running the application.
There is no point in physically storing the exact same byte-code for the software 100 times because this part of the application is always static and will never change. (Unless you write some crazy self-altering piece of software, or you choose to rebuild and redeploy your container's image)
This is why containers don't allow persistence out of the box, and how docker differs from regular VM's that use virtual hard disks. However, this is only true for the persistence inside the container. The files that are being changed by docker software on the hard disk are "mounted" into containers using the docker volumes and thus arent really part of the docker environments, but just mounted into them. (Read more about this at: https://docs.docker.com/userguide/dockervolumes/)
Another question that you might want to ask when you think about this, is how does docker store changes that it makes to its disk on runtime. What is really sweet to check out, is how docker actually manages to get this working. The original state of the container's hard disk is what is given to it from the image. It can NOT write to this image. Instead of writing to the image, a diff is made of what is changed in the containers internal state in comparison to what is in the docker image.
Docker uses a technology called "Union Filesystem", which creates a diff layer on top of the initial state of the docker image.
This "diff" (referenced as the writable container in the image below) is stored in memory and disappears when you delete your container. (Unless you use the command "docker commit", however: I don't recommend this. The state of your new docker image is not represented in a dockerfile and can not easily be regenerated from a rebuild)

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart