I would like to run a Docker instance in RAM... totally inside RAM... using tmpfs.
Can it be done?
I'm not sure how Docker uses filesystems, as I'm used to KVM and Xen, which both need a disk size set up front before they can be used.
So how does "docker fs" work?
This can be done. If you mount /var/lib/docker on a tmpfs, Docker can use other storage backends, like OverlayFS, on top of it.
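A minimal sketch of that setup, assuming a systemd host, root privileges and a 2g cap on the tmpfs (the size is a made-up value). It only prints the commands unless APPLY=1 is set, since mounting over /var/lib/docker hides any images already stored there. Which storage driver the daemon ends up selecting on tmpfs depends on the kernel and Docker version, so the last command checks it rather than assuming.

```shell
#!/bin/sh
# Sketch: back /var/lib/docker with tmpfs.
# Prints the commands by default; set APPLY=1 to really run them (as root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

docker_on_tmpfs() {
    size=$1    # tmpfs size cap, e.g. 2g (assumption, pick what fits your RAM)
    run systemctl stop docker
    # Everything under this mount lives in RAM (or swap) and is lost on reboot.
    run mount -t tmpfs -o "size=$size" tmpfs /var/lib/docker
    run systemctl start docker
    # Report which storage driver the daemon actually picked.
    run docker info --format '{{.Driver}}'
}

docker_on_tmpfs 2g
```

Unlike KVM or Xen disk images, the size here is only an upper bound: tmpfs pages are allocated as files are written.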
Docker uses what it calls a "Union File System", made up of multiple read-only layers with a copy-on-write layer on top (see http://docs.docker.com/terms/layer/). It can use one of several storage drivers for this (in order of preference): AUFS, BTRFS, devicemapper, overlayfs or VFS. Because of this, no, I don't think you will be able to use tmpfs.
More information at https://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/
Related
I am mounting an AWS EFS filesystem on /var/lib/docker and using it as the default Docker backing filesystem. The storage driver is overlay2. I see in the docs that overlay2 only supports xfs and ext4 as backing filesystems. My aim is to mount this backing filesystem on multiple machines so that all of them have the image data, but AWS EBS (which can be ext4, a backing filesystem overlay2 supports) does not support multiple mounts. One way would be to pull the images onto an ext4 filesystem and cp the image data into EFS, but that would take too long. What could be another way to go about this?
The short answer is "don't do that" because /var/lib/docker is not designed to be shared by multiple daemons. You'll find race conditions, erroneous output about networks and containers that don't exist locally, and other errors that won't be fixed/supported.
Instead, put a registry near your cluster, in the same VPC/AZ, and have your nodes pull from that registry. Or have a look at the work done to support eStargz in runtimes like containerd, which can start running a container before the layers are completely pulled.
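A rough sketch of the registry approach, using the official registry:2 image. The host name registry.internal:5000 and the image name myapp:1.0 are placeholders, not anything from the question. Note that a registry served over plain HTTP on a non-localhost address has to be listed under insecure-registries on each node, or be put behind TLS. The script prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: run a private registry close to the nodes and pull from it.
# Prints the commands by default; set APPLY=1 to really run them.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

REGISTRY=${REGISTRY:-registry.internal:5000}   # hypothetical host:port
IMAGE=${IMAGE:-myapp:1.0}                      # hypothetical image name

# On the registry host: start the registry container.
start_registry() {
    run docker run -d -p 5000:5000 --restart=always --name registry registry:2
}

# On a build machine: retag the image for the registry and push it once.
publish() {
    run docker tag "$IMAGE" "$REGISTRY/$IMAGE"
    run docker push "$REGISTRY/$IMAGE"
}

# On every node: each daemon pulls into its own private /var/lib/docker.
pull_on_node() {
    run docker pull "$REGISTRY/$IMAGE"
}

start_registry
publish
pull_on_node
```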
Is there any way to limit the size that a mounted docker volume can grow to? I'm thinking of doing it like it's done here: How to set limit on directory size in Linux? but I feel it's a bit too convoluted for what I need.
By default when you mount a host directory/volume into your Docker container while running it, the Docker container gets full access to that directory and can use as much space as is available.
The approach you're considering is, yes, tedious.
What you can do instead is attach a new, size-limited partition to your server (for example, an EBS volume on your EC2 instance) and mount that inside your container. That will suit your purpose.
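One way this could look, as a sketch: format the extra block device and hand it to Docker as a named volume through the local volume driver, so the volume can never grow beyond the size of the device. The device path /dev/xvdf and the volume name limited are assumptions for illustration. The script prints the commands unless APPLY=1 is set, because mkfs destroys whatever is on the device.

```shell
#!/bin/sh
# Sketch: a Docker volume capped by the size of a dedicated block device.
# Prints the commands by default; set APPLY=1 to really run them (as root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

DEVICE=${DEVICE:-/dev/xvdf}   # hypothetical device, e.g. an attached EBS volume

make_limited_volume() {
    # WARNING: this wipes the device.
    run mkfs.ext4 "$DEVICE"
    # The local driver mounts the device whenever a container uses the volume.
    run docker volume create --driver local --opt type=ext4 --opt device="$DEVICE" limited
    # The size reported inside the container is the size of the device.
    run docker run --rm -v limited:/data busybox df -h /data
}

make_limited_volume
```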
I understand Docker volumes and the way they refer to directories on the host. What about the rest of the filesystem within a container?
To think about it a different way: suppose you have a server with most of the storage on a remote drive, meaning reads and writes take longer than usual. If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM? Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
Non-volume data is stored in a layered overlay filesystem (in most distributions, this will be either an AUFS or DeviceMapper filesystem). The principle is the same in both cases.
As already mentioned in comments, I can recommend reading the section "Understand images, containers and storage drivers" from the official documentation. This answer is just a short summary.
Each Docker image consists of multiple layers of filesystem images. For example, an Apache+PHP image might consist of (1) a generic Ubuntu base layer, (2) an additional layer with the Apache HTTP server installed and (3) another layer on top with PHP-FPM and configuration files (just an example).
When you start a new container from an image, a new per-container layer is added to the existing image layers. This layer will contain all changes that are written within the container itself (to non-volume directories).
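To look at the layers yourself, something like the following should work. It is only a sketch: the container name layer-demo and the file /tmp/new-file are made up, and ubuntu is just an example image. docker history lists the image layers, and docker diff lists what has changed in the per-container layer. The script prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: inspect image layers and the per-container writable layer.
# Prints the commands by default; set APPLY=1 to really run them.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

inspect_layers() {
    image=$1
    # Read-only image layers, one line per layer.
    run docker history "$image"
    # Write one file to a non-volume path; it lands in the container layer.
    run docker run --name layer-demo "$image" touch /tmp/new-file
    # Files added (A), changed (C) or deleted (D) relative to the image.
    run docker diff layer-demo
    # Removing the container discards its layer together with the file.
    run docker rm layer-demo
}

inspect_layers ubuntu
```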
Regarding your specific questions:
If you don't mount any volumes, would it keep any/some/most/all of the container filesystem in RAM?
Nope, there's nothing in RAM (besides the usual filesystem caches). It's all in the overlay filesystem, which is mounted using AUFS, DeviceMapper or another storage driver.
Or does it write some amount of it to disk, meaning it would be just as slow as a volume in this case?
In general, filesystem access in volumes is more performant than in the overlay filesystem. After all, a volume (at least a regular host-based volume, leaving aside volume drivers that add network storage volumes) is simply a bind mount to a regular directory in the host filesystem, bypassing the layer filesystem entirely. The performance of volumes in comparison to the layer filesystem is (among other topics) investigated in this paper:
AUFS introduces significant overhead which is not surprising since I/O is going through several layers, [...]. Applications that are filesystem or disk intensive should bypass AUFS by using volumes. [...] Although containers themselves have almost no overhead, Docker is not without performance gotchas. Docker volumes have noticeably better performance than files stored in AUFS.
I've recently discovered docker. It looks very useful for us.
But what I don't understand is the role of the registry beyond getting initial docker images. We'll likely be starting with some images based on those from docker.io, but will be customizing those and adding some private closed source software.
What concerns me is if the images were large enough then could I run out of space on my / drive.
Can /var/lib/docker just be a mount to a shared file system like cephfs or nfs?
I'm also interested in using CoreOS in a PXE or iPXE configuration. It appears that in that scenario / is mounted as tmpfs up to 50% RAM which is needlessly wasteful for pulling images that could be available on a shared file system. However I've read comments that for some reason /var/lib/docker needs to be on btrfs. Is this true? why?
OK, I've found an answer to my last question. CoreOS requires /var/lib/docker to be mounted on btrfs because it uses the btrfs backend. This backend uses btrfs snapshots to implement the layers Docker uses to represent its images.
Which helps with my second question: can /var/lib/docker just be a mount to a shared file system? By the looks of it, no, not unless the super slow vfs backend is used.
It's easy and cheap to store your registry in S3.
I would recommend against mounting /var/lib/docker on nfs. If someone hammers the nfs, all your services will essentially stop working, since the file systems of the containers live there.
I have a physical host machine with Ubuntu 14.04 running on it. It has 100G disk and 100M network bandwidth. I installed Docker and launched 10 containers. I would like to limit each container to a maximum of 10G disk and 10M network bandwidth.
After going through the official documents and searching on the Internet, I still can't find a way to allocate a specified disk size and network bandwidth to a container.
I think this may not be possible in Docker directly, so maybe we need to bypass Docker. Does this mean we should use something "underlying", such as LXC or cgroups? Can anyone give some suggestions?
Edit:
@Mbarthelemy, your suggestion seems to work but I still have some questions about disk:
1) Is it possible to allocate other size (such as 20G, 30G etc) to each container? You said it is hardcoded in Docker so it seems impossible.
2) I use the command below to start the Docker daemon and container:
docker -d -s devicemapper
docker run -i -t training/webapp /bin/bash
then I use df -h to view the disk usage, it gives the following output:
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/docker-longid   9.8G  276M  9.0G   3% /
/dev/mapper/Chris--vg-root   27G  5.5G   20G  22% /etc/hosts
From the above, I think the maximum disk a container can use is still larger than 10G. What do you think?
I don't think this is possible right now using Docker default settings. Here's what I would try.
About disk usage: You could tell Docker to use the DeviceMapper storage backend instead of AuFS. This way each container would run on a block device (Devicemapper dm-thin target) limited to 10GB (this is a Docker default, luckily enough it matches your requirement!).
According to this link, it looks like the latest versions of Docker now accept advanced storage backend options. Using the devicemapper backend, you can now change the default container rootfs size using --storage-opt dm.basesize=20G (that would be applied to any newly created container).
To change the storage backend: use the --storage-driver=devicemapper Docker option. Note that your previous containers won't be seen by Docker anymore after the change.
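Put together, the daemon invocation could look like the sketch below. It keeps the `docker -d` daemon syntax used elsewhere in this thread; on newer releases the daemon is a separate binary, dockerd, and the devicemapper driver has since been deprecated, so treat this as an illustration for the Docker versions discussed here. It prints the commands unless APPLY=1 is set (with APPLY=1 the daemon would run in the foreground).

```shell
#!/bin/sh
# Sketch: start the daemon with devicemapper and a custom base size,
# then check the root filesystem size from inside a new container.
# Prints the commands by default; set APPLY=1 to really run them (as root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Daemon side. The size applies to newly created containers.
start_daemon() {
    run docker -d --storage-driver=devicemapper --storage-opt "dm.basesize=$1"
}

# Client side, from another shell once the daemon is up.
check_rootfs() {
    run docker run --rm training/webapp df -h /
}

start_daemon 20G
check_rootfs
```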
About network bandwidth: you could tell Docker to use LXC under the hood, using the -e lxc option.
Then create your containers with a custom LXC directive to put them into a traffic class:
docker run --lxc-conf="lxc.cgroup.net_cls.classid = 0x00100001" your/image /bin/stuff
Check the official documentation about how to apply bandwidth limits to this class.
I've never tried this myself (my setup uses a custom OpenVswitch bridge and VLANs for networking, so bandwidth limitation is different and somewhat easier), but I think you'll have to create and configure a different class.
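For what it's worth, here is an untested sketch of the tc side, modelled on the example in the kernel's net_cls cgroup documentation. A classid of 0x00100001 corresponds to tc class 10:1 (upper 16 bits are the major number, lower 16 bits the minor, both written in hex). The interface eth0 and the 10mbit rate are assumptions, and whether the class tag is still visible on the host interface for bridged containers is something I have not verified. The script prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: create an HTB class matching a net_cls classid and filter on it.
# Prints the commands by default; set APPLY=1 to really run them (as root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Convert a net_cls classid (0xAAAABBBB) into a tc handle (AAAA:BBBB, hex).
classid_to_handle() {
    printf '%x:%x\n' $(( $1 >> 16 )) $(( $1 & 0xffff ))
}

limit_class() {
    dev=$1
    classid=$2
    rate=$3
    handle=$(classid_to_handle "$classid")
    major=${handle%%:*}
    # Root HTB qdisc whose handle matches the major number of the classid.
    run tc qdisc add dev "$dev" root handle "$major:" htb
    # The traffic class itself, capped at the requested rate.
    run tc class add dev "$dev" parent "$major:" classid "$handle" htb rate "$rate"
    # cgroup filter: classify packets by the classid of the sending cgroup.
    run tc filter add dev "$dev" parent "$major:" protocol ip prio 10 handle 1: cgroup
}

limit_class eth0 0x00100001 10mbit
```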
Note: the --storage-driver=devicemapper and -e lxc options are for the Docker daemon, not for the Docker client you're using when running docker run ...
Newer releases have --device-read-bps and --device-write-bps, which throttle read and write throughput on a block device.
You can use:
docker run --device-read-bps=/dev/sda:10mb
More info here:
https://blog.docker.com/2016/02/docker-1-10/
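A slightly fuller sketch of how such a limit could be tried out. The device /dev/sda and the ubuntu image are just examples. Keep in mind that these flags cap I/O throughput on the device; they do not limit how much disk space the container can fill, and they have nothing to do with network bandwidth. The dd run uses direct I/O so that the page cache does not hide the throttle. The script prints the command unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: cap a container's write throughput on one block device.
# Prints the command by default; set APPLY=1 to really run it.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

throttled_write() {
    device=$1   # block device backing the container storage (assumption)
    limit=$2    # rate with unit, e.g. 10mb means 10 MB per second
    run docker run --rm --device-write-bps "$device:$limit" ubuntu \
        dd if=/dev/zero of=/tmp/test.out bs=1M count=50 oflag=direct
}

throttled_write /dev/sda 10mb
```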
If you have access to the containers you can use tc for bandwidth control within them.
eg: in your entry point script you can add:
tc qdisc add dev eth0 root tbf rate 240kbit burst 300kbit latency 50ms
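This gives a rate of 240 kbit/s, a burst size of 300 kbit and a latency of 50 ms.
You also need to pass --cap-add=NET_ADMIN to the docker run command, because containers do not get that capability by default, even when the process inside runs as root (unless the container is started with --privileged).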
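As a sketch, an entry point script along these lines would apply the limit and then hand over to the real command. The script name entrypoint.sh and the interface eth0 are assumptions, and the image needs the tc binary (usually from the iproute2 package) installed. It prints the tc command unless APPLY=1 is set in the container's environment.

```shell
#!/bin/sh
# Sketch of an entrypoint.sh: shape egress traffic, then start the real command.
# Prints the tc command by default; set APPLY=1 to really apply it.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Token bucket filter on the root qdisc of the given interface.
shape_egress() {
    run tc qdisc add dev "$1" root tbf rate "$2" burst "$3" latency "$4"
}

shape_egress eth0 240kbit 300kbit 50ms

# Replace the shell with the container's main command, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

The container would then be started with something like docker run --cap-add=NET_ADMIN -e APPLY=1 your/image, with this script set as the image's entry point.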
1) Is it possible to allocate other size (such as 20G, 30G etc) to each container? You said it is hardcoded in Docker so it seems impossible.
To answer this question, please refer to "Resizing Docker containers with the Device Mapper plugin".