How to create a Docker container with a custom root volume size? - docker

I have hit the 10 GB limit on the default root volume size. For this particular container I need a larger size.
So far I've only seen fairly dirty hacks to override the default size.
Could somebody provide me and the community with a clear example of specifying a bigger root volume size upon container creation? Thanks!

I'm going to offer an alternative suggestion - don't. Stop and ask yourself why you need a larger root volume. I would suggest it's likely to be because you're doing something that can be done a better way.
I would suggest that instead you use a storage container (if another 10 GB would be sufficient) or a passthrough mount to a local disk.
The problem with big containers is that they're somewhat at odds with what containerisation is trying to accomplish - compact, lightweight and self-contained program instances.
So I would suggest instead:
docker create -v /path/to/storage:/container_mount --name storage_for_my_app busybox /bin/true
(Or you can just -v /container_mount to keep the data within the container)
Then when you fire up your container:
docker run -d --volumes-from storage_for_my_app your_image
However it may be useful to note - as of Docker 1.9, the size limit is 100G instead: https://docs.docker.com/engine/reference/commandline/daemon/

Related

Docker - Limit mounted volume size

Is there any way to limit the size that a mounted docker volume can grow to? I'm thinking of doing it like it's done here: How to set limit on directory size in Linux? but I feel it's a bit too convoluted for what I need.
By default when you mount a host directory/volume into your Docker container while running it, the Docker container gets full access to that directory and can use as much space as is available.
The way you're trying is, yes, tedious.
What you can do instead is mount a new partition with a limited size on your server (for example, an EBS volume attached to your EC2 instance), mount that inside your container, and that will suit your purpose.
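If you don't have a spare partition or EBS volume handy, a file-backed loop device gives the same effect. A minimal sketch, where the 10 GB size and the paths are placeholders you'd adapt:
# create a 10 GB file and put a filesystem on it
dd if=/dev/zero of=/var/limited.img bs=1M count=10240
mkfs.ext4 -F /var/limited.img
# mount it on the host, then bind-mount it into the container
mkdir -p /mnt/limited
mount -o loop /var/limited.img /mnt/limited
docker run -v /mnt/limited:/data your_image
The container can then fill /data only up to the 10 GB the loop file provides.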

Huge files in Docker containers

I need to create a Docker image (and consequently containers from that image) that use large files (containing genomic data, thus reaching ~10GB in size).
How am I supposed to optimize their usage? Am I supposed to include them in the container (such as COPY large_folder large_folder_in_container)? Is there a better way of referencing such files? The point is that it sounds strange to me to push such a container (which would be >10 GB) to my private repository. I wonder if there is a way of attaching a sort of volume to the container, without packing all those GBs together.
Thank you.
Is there a better way of referencing such files?
If you already have some way to distribute the data I would use a "bind mount" to attach a volume to the containers.
docker run -v /path/to/data/on/host:/path/to/data/in/container <image> ...
That way you can change the image and you won't have to re-download the large data set each time.
If you wanted to use the registry to distribute the large data set, but want to manage changes to the data set separately, you could use a data volume container with a Dockerfile like this:
FROM tianon/true
COPY dataset /dataset
VOLUME /dataset
From your application container you can attach that volume using:
docker run -d --name dataset <data volume image name>
docker run --volumes-from dataset <image> ...
Either way, I think Docker volumes (https://docs.docker.com/engine/tutorials/dockervolumes/) are what you want.
Am I supposed to include them in the container (such as COPY large_folder large_folder_in_container)?
If you do so, that would include them in the image, not the container: you could launch 20 containers from that image, and the actual disk space used would still be 10 GB.
If you were to make another image from your first image, the layered filesystem will reuse the layers from the parent image, and the new image would still be "only" 10GB.
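As a rough illustration (the image and file names here are hypothetical), a derived image only adds its own thin layers on top of the shared 10 GB data layer:
# Dockerfile for a hypothetical child image; the parent's large layer is reused, not duplicated
FROM my-genomic-data-image
COPY analysis.sh /usr/local/bin/analysis.sh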
I was having trouble with a 900 MB JSON file; increasing the memory limit in the Docker preferences fixed it.

Running quick programs inside a docker container

My web application uses graphicsmagick to resize images. Resizing an image will usually take about 500ms. To simplify the setup and wall it off, I wanted to move the graphicsmagick call inside a docker container and use docker run to execute it. However, running inside a container adds an additional ~300ms, which is not really acceptable for my use case.
To reduce the overhead of starting a container one could run an endless program (something like docker run <image> tail -f /dev/null) and then use docker exec to execute the actual call to graphicsmagick inside the running container. However this seems like a big hack.
How would one fix this problem "correctly" with docker or is it just not the right fit here?
This sounds like a good candidate for a microservice, a long-lived server (implemented in your language of choice) listening on a port for resizing requests that uses graphicsmagick under the hood.
Pros:
Implementation details of the resizer are kept inside the container.
Scalability - you can spin up more containers and put them behind a load balancer.
Cons:
You have to implement a server
Pro or Con:
You've started down the road to a microservices architecture.
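For example, deployment could look something like this (resize-service is a hypothetical image containing your server):
# two instances of the hypothetical resize-service image on different host ports
docker run -d -p 8081:8080 --name resize-1 resize-service
docker run -d -p 8082:8080 --name resize-2 resize-service
# the web application then sends resize requests to a load balancer in front of them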
Using docker exec may be possible, but it's really not intended to be used that way. Cut out the middleman and talk to your container directly.
The best ways are:
create a custom deb/rpm package
use one from the system repos
But if you prefer a more Docker-centric approach, the cleanest way I can see is the following (example below):
Start "daemon":
docker run --name ubuntu -d -v /path/to/host/folder:/path/to/guest/folder ubuntu:14.04 sleep infinity
Execute command:
docker exec ubuntu <any needed command>
Where:
"ubuntu" - name of the container
-d - detach, i.e. run the container in the background
-v - mount volume host -> container
sleep infinity - do nothing; used as the entry point, and much nicer than blocking on a read operation.
Use the mounted volume if you are working with files; if not, do not mount a volume and just use pipes (see the example below).
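For instance, assuming graphicsmagick is installed inside the container started above, an image can be piped straight through docker exec without mounting anything:
# stream an image through gm inside the long-running "ubuntu" container
cat input.jpg | docker exec -i ubuntu gm convert - -resize 800x600 - > output.jpg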

Volume and data persistence

What is the best way to persist containers data with docker? I would like to be able to retain some data and be able to get them back when restarting my container. I have read this interesting post but it does not exactly answer my question.
As far as I understand, I only have one option:
docker run -v /home/host/app:/home/container/app
This will mount the host folder into the container.
Is there any other option? FYI, I don't use linked containers (--link).
Using volumes is the best way of handling data which you want to keep from a container. Using the -v flag works well and you shouldn't run into issues with this.
You can also use the VOLUME instruction in the Dockerfile, which means you will not have to add any more options at run time. However, such volumes are quite tightly coupled to the specific container: you'd need to use docker start rather than docker run to get the data back (or, of course, -v to the volume which was created in the past, likely somewhere under /var/lib/docker).
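A minimal sketch of that, assuming your application image is based on ubuntu:14.04 and writes to /home/container/app:
# in your application's Dockerfile
FROM ubuntu:14.04
VOLUME /home/container/app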
A common way of handling volumes is to create a data volume container with volumes defined by -v. Then when you create your app container, use the --volumes-from flag. This will make your new container use the same volumes as the container you used -v on (your data volume container). Of course this may seem like you're shifting the issue somewhere else.
This makes it quite simple to share volumes over multiple containers. Perhaps you have a container for your application, and another for logstash.
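A sketch of that pattern (the image names are placeholders):
# data volume container holding the shared log directory
docker create -v /var/log/app --name app_data busybox /bin/true
# the application writes its logs into the shared volume
docker run -d --volumes-from app_data --name app your_app_image
# logstash reads the very same volume
docker run -d --volumes-from app_data --name logstash your_logstash_image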
Create a volume container: this form of -v creates a volume, i.e. a directory on the host such as /var/lib/docker/volumes/d3b0d5b781b7f92771b7342824c9f136c883af321a6e9fbe9740e18b93f29b69, which is bind mounted at /container/path/vol inside the container:
docker run -v /container/path/vol --name volbox ubuntu
I can now use this container as my volume.
docker run --volumes-from volbox --name foobox ubuntu /bin/bash
root@foobox# ls /container/path/vol
Now, if I distribute these two containers, they will just work. The volume will always be available to foobox, regardless which host it is deployed to.
The snag of course comes if you don't want your storage to be in /var/lib/docker/volumes...
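In that case you can point the same container path at a host directory of your choosing instead (the host path below is a placeholder), and check where the data actually lives with docker inspect:
docker run -v /mnt/big_disk/vol:/container/path/vol --name volbox ubuntu
# on recent Docker versions the backing host path is listed under Mounts
docker inspect -f '{{ .Mounts }}' volbox
The trade-off is that the bind mount ties the container to that host path, so it is no longer fully portable across hosts.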
I suggest you take a look at some of the excellent posts by Michael Crosby and at the Docker docs:
https://docs.docker.com/userguide/dockervolumes/

Limit disk size and bandwidth of a Docker container

I have a physical host machine with Ubuntu 14.04 running on it. It has 100G disk and 100M network bandwidth. I installed Docker and launched 10 containers. I would like to limit each container to a maximum of 10G disk and 10M network bandwidth.
After going through the official documents and searching on the Internet, I still can't find a way to allocate a specified disk size and network bandwidth to a container.
I think this may not be possible in Docker directly; maybe we need to bypass Docker. Does this mean we should use something "underlying", such as LXC or cgroups? Can anyone give some suggestions?
Edit:
@Mbarthelemy, your suggestion seems to work but I still have some questions about disk:
1) Is it possible to allocate other size (such as 20G, 30G etc) to each container? You said it is hardcoded in Docker so it seems impossible.
2) I use the command below to start the Docker daemon and container:
docker -d -s devicemapper
docker run -i -t training/webapp /bin/bash
then I use df -h to view the disk usage, it gives the following output:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/docker-longid 9.8G 276M 9.0G 3% /
/dev/mapper/Chris--vg-root 27G 5.5G 20G 22% /etc/hosts
From the above I think the maximum disk a container can use is still larger than 10G. What do you think?
I don't think this is possible right now using Docker default settings. Here's what I would try.
About disk usage: You could tell Docker to use the DeviceMapper storage backend instead of AuFS. This way each container would run on a block device (Devicemapper dm-thin target) limited to 10GB (this is a Docker default, luckily enough it matches your requirement!).
According to this link, it looks like the latest versions of Docker now accept advanced storage backend options. Using the devicemapper backend, you can now change the default container rootfs size option using --storage-opt dm.basesize=20G (that would be applied to any newly created container).
To change the storage backend: use the --storage-driver=devicemapper Docker option. Note that your previous containers won't be seen by Docker anymore after the change.
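Putting those two options together, starting the daemon would look roughly like this (on newer releases the daemon is started with docker daemon or dockerd rather than docker -d):
docker -d --storage-driver=devicemapper --storage-opt dm.basesize=20G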
About network bandwidth: you could tell Docker to use LXC under the hood: use the -e lxc option.
Then, create your containers with a custom LXC directive to put them into a traffic class:
docker run --lxc-conf="lxc.cgroup.net_cls.classid = 0x00100001" your/image /bin/stuff
Check the official documentation about how to apply bandwidth limits to this class.
I've never tried this myself (my setup uses a custom OpenVswitch bridge and VLANs for networking, so bandwidth limitation is different and somewhat easier), but I think you'll have to create and configure a different class.
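As a rough, untested sketch of what that host-side configuration might look like (classid 0x00100001 corresponds to tc class 10:1; eth0 and 10mbit are placeholders):
# shape traffic for packets carrying net_cls classid 0x00100001 (class 10:1)
tc qdisc add dev eth0 root handle 10: htb
tc class add dev eth0 parent 10: classid 10:1 htb rate 10mbit
tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup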
Note: the --storage-driver=devicemapper and -e lxc options are for the Docker daemon, not for the Docker client you're using when running docker run ...
Newer releases have --device-read-bps and --device-write-bps.
You can use:
docker run --device-read-bps=/dev/sda:10mb your_image
More info here:
https://blog.docker.com/2016/02/docker-1-10/
If you have access to the containers you can use tc for bandwidth control within them.
eg: in your entry point script you can add:
tc qdisc add dev eth0 root tbf rate 240kbit burst 300kbit latency 50ms
to have a bandwidth of 240kbps, burst 300kbps and 50 ms latency.
You also need to pass --cap-add=NET_ADMIN to the docker run command if you are not running the container in privileged mode.
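So the run command would look roughly like this, with your_image standing in for an image whose entrypoint script issues the tc command above:
docker run -d --cap-add=NET_ADMIN your_image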
1) Is it possible to allocate other size (such as 20G, 30G etc) to each container? You said it is hardcoded in Docker so it seems impossible.
To answer this question, please refer to Resizing Docker containers with the Device Mapper plugin.
