Understanding volumes in docker compose - docker

The following is an example given in https://docker-curriculum.com/
version: "3"
services:
  es:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.3.2
    container_name: es
    environment:
      - discovery.type=single-node
    ports:
      - 9200:9200
    volumes:
      - esdata1:/usr/share/elasticsearch/data
  web:
    image: prakhar1989/foodtrucks-web
    command: python app.py
    depends_on:
      - es
    ports:
      - 5000:5000
    volumes:
      - ./flask-app:/opt/flask-app
volumes:
  esdata1:
    driver: local
About /opt/flask-app, it says: "The volumes parameter specifies a mount point in our web container where the code will reside."
I think it means that /opt/flask-app is a mount point that points to ./flask-app on the host machine.
However it doesn't say anything about esdata1 and I can't apply the same explanation as given to /opt/flask-app since there's no esdata1 directory/file in the host machine.
What is happening for the esdata1 ?
My guess is that it means creating a volume (The closest thing I can think of is a disk partition) and name it esdata1 and mount it on /usr/share/elasticsearch/data, am I correct on this guess?

These are two slightly different things: volumes and bind mounts. Bind mounts let you specify a folder on the host machine to serve as storage. Volumes (at least for the local driver) also live in folders on the host machine, but their location is managed by Docker and is sometimes a bit harder to find.
When you specify a volume in docker-compose.yml, a source path that starts with / or . makes it a bind mount, as in the web service. Otherwise, if the source is a single word, it is a named volume, as in the es service.
You can inspect all volumes on your host machine by running docker volume ls.
"What is happening for the esdata1 ? My guess is that it means creating a volume (The closest thing I can think of is a disk partition) and name it esdata1 and mount it on /usr/share/elasticsearch/data, am I correct on this guess?"
That's all correct.
I don't claim to lay down rules, but in general volumes are better suited to sharing common data between several containers, as in docker-compose, while bind mounts are better suited to sharing data from the host with a container, such as initial configs for services.

Related

define volumes in docker-compose.yaml

I am writing a docker-compose.yaml file for my project. I have checked the volumes documentation here.
I also understand the concept of a volume in docker: I can mount one with e.g. -v my-data/:/var/lib/db, where my-data/ is a directory on my host machine and /var/lib/db is the path inside the database container.
My confusion is with the link I put above. It has the following sample:
version: "3.9"
services:
  db:
    image: db
    volumes:
      - data-volume:/var/lib/db
  backup:
    image: backup-service
    volumes:
      - data-volume:/var/lib/backup/data
volumes:
  data-volume:
Does this mean that I have to create a directory named data-volume on my host machine? What if I have a directory on my machine at temp/my-data/ and I want to mount it at /var/lib/db in the database container? Should I do something like the following?
version: "3.9"
services:
  db:
    image: db
    volumes:
      - temp/my-data/:/var/lib/db
volumes:
  temp/my-data/:
My main confusion is the volumes: section at the bottom. I am not sure whether the volume name should be the path of my directory or just a name I choose. If it is the latter, how is that name mapped to temp/my-data/ on my machine? The sample doesn't show that.
Could someone please clarify it for me?
P.S. I tried the docker-compose file I guessed above and ended up with this error:
ERROR: The Compose file './docker-compose.yaml' is invalid because:
volumes value 'temp/my-data/' does not match any of the regexes: '^[a-zA-Z0-9._-]+$'
Mapped volumes can either be files/directories on the host machine (sometimes called bind mounts in the documentation) or they can be docker volumes that can be managed using docker volume commands.
The volumes: section in a docker-compose file specifies docker volumes, i.e. not files/directories. The first docker-compose file in your post uses such a volume.
If you want to map a file or directory (like in your last docker-compose file), you don't need to specify anything in the volumes: section.
Docker volumes (the ones specified in the volumes: section or created using docker volume create) are of course also stored somewhere on your host computer, but docker manages that and you shouldn't normally need to know where or what the format is.
This part of the documentation is pretty good about explaining it, I think https://docs.docker.com/storage/volumes/
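Applied to the directory from the question, that could look like the sketch below. It assumes temp/my-data sits next to docker-compose.yaml; the leading ./ is there so that Compose reads the source as a host path rather than as a volume name. The top-level volumes: section is dropped because the file no longer uses a named volume.

```yaml
version: "3.9"
services:
  db:
    image: db
    volumes:
      # bind mount: host directory (relative to this file) -> path in the container
      - ./temp/my-data:/var/lib/db
```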
As @HansKilian mentions, you don't need both the top-level volumes section and services.volumes. To bind-mount with services.volumes, map the host directory to the container directory like this:
services:
  db:
    image: db
    volumes:
      - /host/path/lib/db:/container/path/lib/db
With that, the directory /host/path/lib/db on the host machine will be used by the container and available at /container/path/lib/db.
Now, if you're like me and get confused by fake examples, let's say the real directory on your host machine is /var/lib/db and you just want to see it at /db when you run a shell in the container (i.e., docker exec -it container-id /bin/bash).
docker-compose.yaml would look like this:
services:
  db:
    image: db
    volumes:
      - /var/lib/db:/db
Now when you run the shell, cd /db and ls, you'll see the same results as if you'd run cd /var/lib/db and ls on the host.
If you want the top-level volumes section to refer to a volume that exists outside of Compose, you first have to create that volume using docker volume create and then mark it as external. (Without external: true, Compose creates the volume itself.) The documentation Hans linked includes steps to do this. The syntax of /host/path:/container/path is replaced by volume-name:/container/path. Your docker-compose.yaml would then look more like this:
services:
  db:
    image: db
    volumes:
      - your-global-volume-name:/db
volumes:
  your-global-volume-name:
    external: true
Note that I have not tested or used this configuration. I'm assuming it's correct based on the other method working and the few changes I can identify in the docs.

Docker change location of named volumes

I have a problem that I just can't understand. I am using docker to run certain containers, but I have problems with at least one volume, and I'd like to ask if anybody can give me a hint about what I am doing wrong. I am using Nifi-Ingestion as an example, but it affects other container volumes too.
First, let's talk about the versions I use:
Docker version 19.03.8, build afacb8b7f0
docker-compose version 1.27.4, build 40524192
Ubuntu 20.04.1 LTS
Now, let's show the volume in my working docker-compose file.
In my container, it is configured as follows:
volumes:
  - nifi-ingestion-conf:/opt/nifi/nifi-current/conf
At the bottom of my docker-compose file it is defined as a normal named volume:
volumes:
  nifi-ingestion-conf:
This is a snippet from the docker-compose file that I'd like to get working.
In my container, it is configured in this case as follows (with my STORAGE_VOLUME_PATH defined as /mnt/storage/docker_data):
volumes:
  - ${STORAGE_VOLUME_PATH}/nifi-ingestion-conf:/opt/nifi/nifi-current/conf
At the bottom I guess there is something to change, but I don't know what. In this case it is the same as in the working docker-compose file:
volumes:
  nifi-ingestion-conf:
So, what's my problem?
I have two docker-compose files. One uses normal named volumes, and one uses volumes under my extra mount path. When I run the containers, the volumes seem to work differently: files are written with the first style, but not with the second. My mount paths are created in the second version, so there is nothing wrong with the environment variables in my .env file.
Hint: the /mnt/storage/docker_data is an NFS-mount but my machine has the full privileges on that share.
Here is my fstab-entry to mount that volume (maybe I have to set other options):
10.1.0.2:/docker/data /mnt/storage/docker_data nfs auto,rw
Bigger snippets
Here is a bigger snippet of the docker-compose file (I had to cut it down and remove confidential data; my problem is not that it does not work, only that the volume acts differently. Everything for this one volume is in the code):
version: "3"
services:
  nifi-ingestion:
    image: my image on my personal repo
    container_name: nifi-ingestion
    ports:
      - 0000
    labels:
      - app-specific
    volumes:
      - ${STORAGE_VOLUME_PATH}/nifi-ingestion-conf:/opt/nifi/nifi-current/conf
      #working: - nifi-ingestion-conf:/opt/nifi/nifi-current/conf
    environment:
      - app-specific
    networks:
      - cnetwork
volumes:
  nifi-ingestion-conf:
networks:
  cnetwork:
    external: false
    ipam:
      driver: default
      config:
        - subnet: 192.168.1.0/24
And here is the .env file (only the value we are using):
STORAGE_VOLUME_PATH=/mnt/storage/docker_data
If I understand your question correctly, you wonder why the following docker-compose snippet works for you
version: "3"
services:
  nifi-ingestion:
    volumes:
      - nifi-ingestion-conf:/opt/nifi/nifi-current/conf
volumes:
  nifi-ingestion-conf:
and the following docker-compose snippet does not work for you
version: "3"
services:
  nifi-ingestion:
    volumes:
      - ${STORAGE_VOLUME_PATH}/nifi-ingestion-conf:/opt/nifi/nifi-current/conf
What makes them different is how you use volumes. You need to differentiate between mounting host paths and mounting named volumes.
You can mount a host path as part of a definition for a single service, and there is no need to define it in the top level volumes key.
But, if you want to reuse a volume across multiple services, then define a named volume in the top-level volumes key.
Named volumes are managed by Docker.
If you start a container with a volume that does not yet exist, Docker creates the volume for you.
I would also advise you to read this answer.
Update:
You might also want to read about docker nfs volumes.
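As a rough sketch of that last suggestion, a named volume can be backed by the NFS export directly through the local driver's options, so the service keeps the named-volume syntax that already works. The server address and export path are taken from the fstab entry in the question; the nifi-ingestion-conf subdirectory on the export and the mount options are assumptions and would need to be adjusted for the real share. One behavioural difference that may explain the question: an empty named volume is populated with the files the image already has at the mount point, whereas a bind-mounted host directory hides them.

```yaml
version: "3"
services:
  nifi-ingestion:
    volumes:
      - nifi-ingestion-conf:/opt/nifi/nifi-current/conf
volumes:
  nifi-ingestion-conf:
    driver: local
    driver_opts:
      type: nfs
      # addr is the NFS server from the fstab entry; the other options are assumptions
      o: "addr=10.1.0.2,rw"
      # assumed subdirectory on the export; it must already exist on the server
      device: ":/docker/data/nifi-ingestion-conf"
```

As in the snippets above, this is a fragment: the image and the other service keys are left out.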

Re-using existing volume with docker compose

I have set up two standalone docker containers: one runs a webserver, the other runs a mysql database for it.
I am now trying to get this working with docker-compose. It all runs well, but I was wondering how I could re-use the existing volumes from the standalone containers I created previously (since I want to retain their data).
I saw people suggesting the external: true option for this, but I could not get the syntax right so far.
Is external: true the correct approach for this, or should I approach it differently?
Or can I just specify the path to the volume within docker-compose.yml and make it use the old existing volume?
Yes, you can. Set external to true and set name to the name of the volume you want to mount, as in the example below:
version: "3.5"
services:
  transmission:
    image: linuxserver/transmission
    container_name: transmission
    volumes:
      - transmission-config:/config
      - /path/to/downloads:/downloads
    ports:
      - 51413:51413
      - 51413:51413/udp
    networks:
      - rede
    restart: always
networks:
  rede:
    external: true
    name: rede
volumes:
  transmission-config:
    external: true
    name: transmission-config
Per the documentation, using the external flag allows you to use volumes created outside the scope of the docker-compose file.
However, it is advisable to create a fresh volume via the docker-compose file and copy the existing data from the old volumes to the new volumes.
You can create a volume explicitly using the docker volume create command, or Docker can create a volume during container or service creation. When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container.
If your system is running, you can copy the data out of the mysql container to the host with docker cp:
docker cp "${container_id}":/path_to_folder /path_to_server
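Another way to do the copy, sketched here, is a throwaway service that mounts both volumes. The names old-data and new-data, the alpine image, and the mount points /from and /to are all assumptions for illustration; old-data stands for the volume your standalone container already uses.

```yaml
version: "3.5"
services:
  migrate:
    image: alpine
    # copy everything, preserving ownership and permissions, then exit
    command: sh -c "cp -a /from/. /to/"
    volumes:
      - old-data:/from:ro
      - new-data:/to
volumes:
  old-data:
    external: true   # created outside this Compose file
  new-data:          # created by Compose on first run
```

The old volume is mounted read-only so that the copy cannot modify the original data.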

docker volume type - bind vs volume

TLDR
In docker-compose, what's the difference between
volumes:
  - type: volume
    source: mydata
    target: /data
and
volumes:
  - type: bind
    source: mydata
    target: /data
?
The question in long:
When you specify the volumes option in your docker-compose file, you can use the long-syntax style.
According to the docs, the type option accepts 3 different values: volume, bind and tmpfs.
I understand the tmpfs option: it means that the data will not be saved after the container is down.
But I fail to find any reference in the docs about the difference between the other 2 options, bind and volume. Could someone enlighten me about that?
While bind mounts are files coming from your host machine, volumes are more like Docker's own NAS.
Bind mounts are files mounted from your host machine (the one that runs your docker daemon) onto your container.
Volumes are like storage spaces totally managed by docker.
You will find, in the literature, two types of volumes:
named volumes (you provide the name)
anonymous volumes (Docker generates a UUID-like name, like the ones you see on containers or untagged images)
Those volumes come with their own set of docker commands; you can also consult this list via
docker volume --help
You can see your existing volumes via
docker volume ls
You can create a named volume via
docker volume create my_named_volume
But you can also create a volume via a docker-compose file
version: "3.3"
services:
  mysql:
    image: mysql
    volumes:
      - type: volume
        source: db-data
        target: /var/lib/mysql/data
volumes:
  db-data:
Where this is the part saying "please docker, mount the volume named db-data on top of the container directory /var/lib/mysql/data":
      - type: volume
        source: db-data
        target: /var/lib/mysql/data
And this is the part saying "please docker, create a volume named db-data":
volumes:
  db-data:
Docker documentation about the three mount types:
https://docs.docker.com/storage/bind-mounts/
https://docs.docker.com/storage/volumes/
https://docs.docker.com/storage/tmpfs/
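To make the comparison concrete, here is a sketch with one mount of each type in long syntax. The service name, image, command, and container paths are made up for illustration; only the three type values come from the docs quoted in the question.

```yaml
version: "3.7"
services:
  app:
    image: alpine
    command: sleep 3600
    volumes:
      # named volume: storage managed by Docker, declared at the bottom
      - type: volume
        source: mydata
        target: /data
      # bind mount: a host directory next to this file (it must already exist)
      - type: bind
        source: ./mydata
        target: /host-data
      # tmpfs: in-memory only, gone when the container stops
      - type: tmpfs
        target: /scratch
volumes:
  mydata:
```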
If I understood you correctly, you're asking, in other words: what is the difference between volumes and bind mounts?
Differences in management and isolation on the host
Bind mounts exist on the host file system and are managed by the host maintainer. Applications / processes outside of Docker can also modify them.
Volumes can also be implemented on the host, but Docker manages them for us and processes outside of Docker are not supposed to modify them.
Volumes are a much wider solution
Although both solutions help us to separate the data lifecycle from containers, by using volumes you gain much more power and flexibility over your system.
With volumes we can design our data effectively and decouple it from the host and other parts of the system by storing it in dedicated remote locations (cloud, for example) and integrating it with external services like backups, monitoring, encryption and hardware management.
More advantages of volumes over bind mounts:
No host concerns.
Can be managed using the Docker CLI.
Volumes can save you some permission issues related to uid/gid, which occur in cases like a container user's uid not matching the host's.
A new volume's contents can be pre-populated by a container.
Examples
Let's take 2 scenarios.
Case 1: Web server.
We want to provide our web server with a configuration file that might change frequently. For example: exposing ports according to the current environment.
We can rebuild the image each time with the relevant setup or create 2 different images, one for each environment. Neither of these solutions is very efficient.
With bind mounts, Docker mounts the given source directory into a location inside the container.
(The original directory / file in the read-only layer inside the union file system will simply be overridden.)
For example - binding a dynamic port to nginx:
version: "3.7"
services:
  web:
    image: nginx:alpine
    volumes:
      - type: bind #<-----Notice the type
        source: ./mysite.template
        target: /etc/nginx/conf.d/mysite.template
    ports:
      - "9090:8080"
    environment:
      - PORT=8080
    command: /bin/sh -c "envsubst < /etc/nginx/conf.d/mysite.template > /etc/nginx/conf.d/default.conf && exec nginx -g 'daemon off;'"
(*) Notice that this example could also be solved using Volumes.
Case 2 : Databases.
Docker containers do not store persistent data: any data written to the writable layer in the container's union file system is lost once the container is removed.
But what if we have a database running in a container, and the container is removed? Does that mean that all the data will be lost?
Volumes to the rescue.
Those are named file system trees which are managed for us by Docker.
For example - persisting PostgreSQL data:
services:
  db:
    image: postgres:latest
    # Short syntax equivalent (use one form, not both):
    # volumes:
    #   - "dbdata:/var/lib/postgresql/data"
    volumes:
      - type: volume #<-----Notice the type
        source: dbdata
        target: /var/lib/postgresql/data
volumes:
  dbdata:
Notice that in this case, for named volumes, the source is the name of the volume
(For anonymous volumes, this field is omitted).
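For comparison, an anonymous volume in the same long syntax might look like the sketch below. It reuses the db service from the example above; with no source, there is no name to declare under the top-level volumes key.

```yaml
services:
  db:
    image: postgres:latest
    volumes:
      - type: volume
        # no source: Docker generates a name for the volume
        target: /var/lib/postgresql/data
```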
Feature: Internal soul
  Bind: Bind mounts attach a user-specified location on the host filesystem to a specific point in a container file tree.
  Volume: Volumes attach disk storage on the host filesystem or cloud storage.

Feature: Command
  Bind: --mount type=bind,src="",dst=""
  Volume: Docker CLI docker volume command

Feature: Dependency
  Bind: Dependent on a location on the host filesystem.
  Volume: Container-independent data management.

Feature: Separation of concerns
  Bind: No
  Volume: Yes

Feature: Conflict with other containers
  Bind: Yes. Example: multiple instances of Cassandra that all use the same host location as a bind mount for data storage. In that case, each of the instances would compete for the same set of files. Without other tools such as file locks, that would likely result in corruption of the database.
  Volume: No. By default, Docker creates volumes by using the local volume plugin.

Feature: When to choose
  Bind: 1. Bind mounts are useful when the host provides a file or directory that is needed by a program running in a container, or when that containerized program produces a file or log that is processed by users or programs running outside containers. 2. Appropriate tools for workstations and machines with specialized concerns. 3. Systems with more traditional configuration management tooling.
  Volume: Working with persistent storage: 1. Databases 2. Cloud storage

Feature: When not to choose
  Bind: Better to avoid these kinds of specific bindings in generalized platforms or hardware pools.
  Volume: To be written

What is the difference between volumes-from and volumes?

I saw the docker-compose patterns but I'm confused. What is the best way to compose containers?
When should I use links, or volumes_from?
When should I use volumes_from, or volumes?
#1 app-db-data
app:
  image: someimage
  links:
    - db # data volume container name
db:
  image: mysql
  volumes_from:
    - data # data volume name
data:
  image: someimage
  volumes:
    - {host data}:{guest data}
#2 app-db+data
app:
  image: someimage
  links:
    - db # data volume container name
db:
  image: mysql
  volumes:
    - data # data file name
app
#1 app-service-data
app:
  image: someimage
  volumes_from:
    - service # service container name
service:
  image: mysql
  volumes_from:
    - data # image container name
data:
  image: someimage
  volumes:
    - {host data}:{guest data}
#2 app-service+data
app:
  image: someimage
  volumes_from:
    - service # service container name
service:
  image: mysql
  volumes:
    - data # mounted file
Thanks
In short:
volumes_from mounts from other containers.
volumes mounts defined inline.
links connects containers.
A little bit more explained:
volumes_from mounts volumes from other containers, for example when you have data-only containers and want to mount their volumes in the container that has your application code.
volumes is the inline way to define and mount volumes. If you read #17798 you can see that named volumes can replace data-only containers in most cases.
The simplest option is then to use volumes, since you can reuse them by naming them.
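A sketch of that replacement, assuming Compose file format 2 or later (where the top-level volumes key exists): two services share one named volume, with no data-only container in between. The backup service, its image, and the dbdata name are placeholders.

```yaml
version: "2"
services:
  db:
    image: mysql
    volumes:
      - dbdata:/var/lib/mysql
  backup:
    image: someimage
    volumes:
      # same volume, mounted read-only in the second container
      - dbdata:/backup/source:ro
volumes:
  dbdata:
```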
links is different. Because it does not mount. Instead it connects containers. So if you do:
app:
  container_name: app_container
  links:
    - db
That means that if you connect to app_container with docker exec -it app_container bash and try ping db you will see that container is able to resolve ip for db.
This is because docker creates a network between containers.
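In Compose file format 2 and later, the services in one file join a shared default network and can already reach each other by service name, so the same lookup works without links. A minimal sketch, with someimage standing in for a real application image:

```yaml
version: "2"
services:
  app:
    image: someimage
    container_name: app_container
    depends_on:
      - db   # start order only; name resolution comes from the default network
  db:
    image: mysql
```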
Link and volumes_from are different concepts. Links are used when you need to connect (by network) two containers. In this case if you want to connect an App to the Database, the way to do this is by using a link, since applications use a port and host to connect to a database (not a directory on the filesystem).
volumes and volumes_from differ in that the first one only declares volumes that docker will make persistent, or host:guest mounts, whereas volumes_from tells docker to use volumes that are already declared on another container (making them available to this container).
Of those 4 cases that you present, I think that the first and second are good choices. In the first you are creating a data only container, and make the mysql container use it. In the second case the data and the mysql container are the same.
Links and volumes are perfectly explained in the docker documentation.
Hope it helps.
Addition: volumes_from is used when you want to mount all anonymous volumes of a container; named volumes could have been mounted directly since the early days.
As far as I can see from https://docs.docker.com/compose/compose-file/#volumes, docker-compose has removed this functionality entirely; I am not sure how and why, or whether there is an alternative. But assume you have an app container and an httpd container. Usually you would define the codebase folder, /var/www, as an anonymous volume and then mount it in httpd so that the httpd service can serve static files, while passing all dynamic requests (ruby/php/java) to an upstream backend on app.
The point of using an anonymous volume and not a named volume is that you want to be able to redeploy app and change the codebase (an app update), which would not work if app had a named volume. Anonymous volumes do exactly that, and that's why volumes_from is used here: using named volumes is not an option in this case (though it is very practical in a lot of other cases).
For reference, the upgrade guide for volumes_from:
https://docs.docker.com/compose/compose-file/compose-versioning/#upgrading
So volumes_from is usually used in a different context / scenario, and named volumes are the standard in all other cases, as explained above. A brief post about that is https://stackoverflow.com/a/44744861/3625317

Resources