Why add top-level keys in Docker Compose?

I find top-level keys like volumes and networks in many docker-compose.yml files, for example here and in this repository:
networks:
  network:
volumes:
  db:
Only the keys are declared, with no values. I notice that both of these keywords already appear inside services, so I wonder why they are added again globally, and whether their appearance inside the predefined services should already be enough.
GPT answered that:
Adding top-level keys to Docker Compose allows you to define multiple services that can be run together in an application. This can be useful when creating complex applications with multiple components that are connected to each other. It also provides an easy way to configure and scale your application.
I don't think its first sentence is correct since without those keys I can also define multiple services that can cooperate in an application.
Could anyone please verify that? Thanks.

These things can be configured. One example is attaching configuration to a specific network. Say you have a set of containers, but you also need to interact with containers in another Compose setup. You could declare that other network with its specific name:
networks:
  other:
    external: true
    name: other_default

services:
  one:
    networks: [default, other]
  two:
    networks: [default, other]
Declaring these at the top level saves us from repeating the configuration every time a network or volume is used, since the same network or volume can appear in multiple services.
In principle it would be possible for Compose to scan the list of services and create everything with default settings as required. Requiring the top-level volumes: and networks: lists simplifies the implementation a little. It is also a little bit of protection against typos; Docker doesn't give you a lot of that, but if you write
services:
  one:
    volumes:
      - exchange:/data
  two:
    volumes:
      - echxange:/data

volumes:
  exchange:
Compose will notice that the misspelled echxange doesn't exist and complain.

Top-level elements such as the networks top-level element and the volumes top-level element sit alongside the most commonly used services top-level element. They define the networks and volumes that the services declared under services can then reference.
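As a minimal sketch (the service, network, and volume names here are made up for illustration), a file that declares both top-level elements and references them from a service could look like this:
services:
  app:
    image: nginx:alpine        # placeholder image
    networks:
      - backend                # must match an entry under the top-level networks:
    volumes:
      - appdata:/data          # "appdata" must match an entry under the top-level volumes:

networks:
  backend:                     # created with default settings, named <project>_backend

volumes:
  appdata:                     # named volume that persists between container restarts
Running docker compose up with a file like this creates the network and the volume first, then attaches the app service to them.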

Related

Docker: Multiple Compositions

I've seen many examples of Docker Compose and it makes perfect sense to me, but they all bundle their frontend and backend as separate containers in the same composition. In my use case I've developed a backend (in Django) and a frontend (in React) for a particular application. However, I want my backend API to be consumable by other client applications down the road, and thus I'd like to isolate them from one another.
Essentially, I envision it looking something like this. I would have a docker-compose file for my backend, which would consist of a PostgreSQL container and a webserver (Apache) container with a volume pointing to my source code. I'm not going to get into implementation details, but because containers in the same composition exist on the same network, I can refer to the DB in the source code using the alias from the file. That is one environment with 2 containers.
On my frontend, and any other future client applications that consume the backend, I would have a webserver (Apache) container to serve the compiled static build of the React source. That of course exists in its own environment, so my question is: how do I converge the two such that I can refer to the backend alias in my base URL (axios, fetch, etc.)? How do you ship both "environments" to a registry and then deploy from that registry such that they can continue to communicate with each other?
I feel like I'm probably missing the mark on how the Docker architecture works at large, but to my knowledge there is a default network, and Docker will run the composition on that default network unless otherwise specified or it's already in use. However, two separate compositions are two separate networks, no? I'd very much appreciate a lesson on the semantics, and thank you in advance.
There's a couple of ways to get multiple Compose files to connect together. The easiest is just to declare that one project's default network is the other's:
networks:
  default:
    external:
      name: other_default
(docker network ls will tell you the actual name once you've started the other Compose project.) This is also suggested in the Docker Networking in Compose documentation.
An important architectural point is that your browser application will never be able to use the Docker hostnames. Your fetch() call runs in the browser, not in Docker, and so it needs to reach a published port. The best way to set this up is to have the Apache server that's serving the built UI code also run a reverse proxy, so that you can use a same-server relative URL /api/... to reach the backend. The Apache ProxyPass directive would be able to use the Docker-internal hostnames.
You also mention "volume with your source code". This is not a Docker best practice. It's frequently used to make Docker simulate a local development environment, but it's not how you want to deploy or run your code in production. The Docker image should be self-contained, and your docker-compose.yml generally shouldn't need volumes: or a command:.
A skeleton layout for what you're proposing could look like:
version: '3'
services:
  db:
    image: postgres:12
    volumes:
      - pgdata:/var/lib/postgresql/data
  backend:
    image: my/backend
    environment:
      PGHOST: db
    # No ports: (not directly exposed) (but it could be)
    # No volumes: or command: (use what's in the image)
volumes:
  pgdata:
and a second docker-compose.yml for the frontend project:
version: '3'
services:
  frontend:
    image: my/frontend
    environment:
      BACKEND_URL: http://backend:3000
    ports:
      - 8080:80
networks:
  default:
    external:
      name: backend_default

Is it possible to read network name in docker-compose from env?

I'm trying not to hard-code my network name since it's for an open source project (and I have multiple instances running on the same server for different apps).
Is it possible to use environment variables when defining the network?
This doesn't work:
networks:
  ${DOCKER_NETWORK_NAME}:
    name: ${DOCKER_NETWORK_NAME}
Compose has an internal notion of a project name and most Docker object names are prefixed with that name. For example, if you are in a directory named foo and your Compose file has
networks:
  something:
and you run docker network ls, you will see a network named foo_something.
I would generally recommend not manually specifying the names of networks, volumes, or containers. You can choose any name you want to be used within the docker-compose.yml file and it will be scoped to that file.
Conversely, this requires that different installations of the system either be in directories with different names, set the COMPOSE_PROJECT_NAME environment variable (possibly in a .env file), or consistently use the docker-compose -p flag.
In the very specific case of networks, Compose provides a network named default which is the default if you don't actually have networks: blocks. There's not really any downside to using this, and most applications won't need multiple internal networks. I'd just leave out networks: entirely.
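As a sketch of that last suggestion (the images and the environment variable here are placeholders, not taken from the question), a Compose file with no networks: block at all still gives every service a shared default network:
services:
  app:
    image: example/app:latest    # placeholder application image
    environment:
      DB_HOST: db                # other services are reachable by their service name
  db:
    image: postgres:15
# No networks: section is needed; Compose creates <project>_default automatically,
# and the project name (directory name, COMPOSE_PROJECT_NAME, or -p) keeps
# multiple installations on the same server separate.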

Docker - using labels to influence the start-up sequence

My Django application uses Celery to process tasks on a regular basis. Sadly this results in having 3 containers (app, Celery worker, Celery beat), each of them having its very own startup shell script instead of a docker entrypoint script.
So my idea was to have a single entrypoint script which is able to process the labels I set in my docker-compose.yml. Based on the labels, the container should start as an app, Celery beat or Celery worker instance.
I have never done such an implementation before, but I am asking myself if it is even possible, as I saw something similar in the Traefik load balancer project, see e.g.:
loadbalancer:
  image: traefik:1.7
  command: --docker
  ports:
    - 80:80
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  networks:
    - frontend
    - backend
  labels:
    - "traefik.frontend.passHostHeader=false"
    - "traefik.docker.network=frontend"
  ...
I didn't find any good material on this on the web, on how to implement such a scenario, or whether it's even possible the way I imagine it. Has somebody done it like that before, or should I rather stay with 3 separate shell scripts, one for each service?
You can access the labels from within the container, but it does not seem to be as straightforward as other options and I do not recommend it. See this StackOverflow question.
If your use cases (== entrypoints) are more different than alike, it is probably easier to use three entrypoints or three commands.
If your use cases are more similar, then it is easier and clearer to simply use environment variables.
Another nice alternative that I like to use is to create one entrypoint shell script that accepts arguments, so you have one entrypoint, and the arguments are provided using the command definition.
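A rough Compose-level sketch of that argument-based pattern (the image name and the argument values are assumptions for illustration; the entrypoint script baked into the image would have to dispatch on its first argument):
services:
  app:
    image: my-django-app:latest       # hypothetical shared image with a single entrypoint script
    command: ["app"]                  # argument the entrypoint uses to start the web application
  celery-worker:
    image: my-django-app:latest
    command: ["celery-worker"]        # same image, started as a Celery worker
  celery-beat:
    image: my-django-app:latest
    command: ["celery-beat"]          # same image, started as Celery beat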
Labels are designed to be used by the docker engine and other applications that work at the host or docker-orchestrator level, and not at the container level.
I am not sure how the traefik project is using that implementation. If they use it, it should be totally possible.
However, I would recommend using environment variables instead of docker labels. Environment variables are the recommended way to pass configuration parameters to a cloud-native app. The use of labels is more related to service metadata, so you can identify and filter specific services. In your scenario you could have something like this:
version: "3"
services:
celery-worker:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=celery-worker
celery-beat:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=celery-beat
app:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=app
Then you can use the SERVICE_TYPE environment variable in your docker entrypoint to launch the specific service.
However (again), there is nothing wrong with having 3 different docker images. In fact, that's the idea of containers (and microservices). You encapsulate the processes in images and instantiate them in containers. Each one of them will have different purposes and lifecycles. For development purposes, there is nothing wrong with your implementation. But in production, I would recommend separating the services into different images. Otherwise, you have big images, each service only using a third of the functionality, and you hard-couple the lifecycles of the services.

Micro Services With Docker Compose: Same Container, Multiple Projects

Along with a few others, I am having issues using a microservices architecture for my applications and employing docker-compose the way I want to.
Summary:
I have X microservice projects (let's call these project A, project B and project C). Each microservice depends on the same containers (let's call these dependency D and dependency E).
The Problem:
Ideally, projects A, B and C would ALL have both dependencies (D & E) in their docker-compose.yml files; however, this becomes an issue as Docker Compose sees these as duplicate containers when, in reality, I would like to reuse them. Here is an error message that is commonly seen:
ERROR: for A Cannot create container for service A: b'Conflict. The
container name "/A" is already in use by container "sha". You have to
remove (or rename) that container to be able to reuse that name.'
From what I have seen, people are recommending that you define the container in one project and reference it using networks and external links. Although this works, it introduces a dependency on a different docker-compose yml file (the file that defines the dependency!).
Another approach that I've read argues for isolating containers in their own docker-compose files and then referencing multiple files when you want to build. Again, although this works, it's certainly not as stunningly convenient as Docker typically is to work with. If I am unable to work out a solution, I will go with this approach.
Have other people in the non-mono repo world (specifically with micro services) had any success with a different approach?
I've been asked to clarify with some examples:
Here is what 2 different compose yml files look like for project A and project B:
Project A:
version: '2'
services:
  dependencyD:
    image: dependencyD:latest
    container_name: dependencyD
  dependencyE:
    image: dependencyE:latest
    container_name: dependencyE
  projectA:
    image: projectA:latest
    container_name: projectA
    depends_on:
      - dependencyD
      - dependencyE
Project B:
version: '2'
services:
  dependencyD:
    image: dependencyD:latest
    container_name: dependencyD
  dependencyE:
    image: dependencyE:latest
    container_name: dependencyE
  projectB:
    image: projectB:latest
    container_name: projectB
    depends_on:
      - dependencyD
      - dependencyE
There is a feature called external links. From the docs:
Link to containers started outside this docker-compose.yml or even outside of Compose, especially for containers that provide shared or common services.
Having multiple docker-compose.yml files is also a common way to organize containers into meaningful groups. Maybe your scenario can use multiple YAML files together with external links.
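A hedged sketch of how that could look (the file layout, the shared project name, and the use of an external default network are assumptions, not something given in the question). One file, started first, owns the shared dependencies:
version: '2'
services:
  dependencyD:
    image: dependencyD:latest
    container_name: dependencyD
  dependencyE:
    image: dependencyE:latest
    container_name: dependencyE
Each project file then joins that project's network and links to the already-running containers:
version: '2'
services:
  projectA:
    image: projectA:latest
    external_links:
      - dependencyD
      - dependencyE
networks:
  default:
    external:
      name: shared_default   # network created by the shared file, assuming its directory is named "shared"
Project B would use an identical networks: block, so D and E are started only once and reused by every project.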

Docker swarm having some shared volume

I will try to describe my desired functionality:
I'm running Docker Swarm with a docker-compose file.
In the docker-compose file I have services; for simplicity let's call them A, B and C.
Assume service C, which includes shared code modules, needs to be accessible to services A and B.
My questions are:
1. Must each service that needs access to the shared volume mount it into its own local folder (using the volumes section as below), or can the code be accessed without mounting/copying it to a path in the local container?
2. In Docker Swarm, it can happen that instances of services A and B reside on computer X, while service C resides on computer Y.
Is it true that, because the services are all maintained under the same Docker Swarm stack, they will communicate with service C without problems?
If not, which definitions are needed to achieve that?
My structure is something like that:
version: "3.4"
services:
A:
build: .
volumes:
- C:/usr/src/C
depends_on:
- C
B:
build: .
volumes:
- C:/usr/src/C
depends_on:
- C
C:
image: repository.com/C:1.0.0
volumes:
- C:/shared_code
volumes:
C:
If what you’re sharing is code, you should build it into the actual Docker images, and not try to use a volume for this.
You’re going to encounter two big problems. One is getting a volume correctly shared in a multi-host installation. The second is a longer-term issue: what are you going to do if the shared code changes? You can’t just redeploy the C module with the shared code, because the volume that holds the code already exists; you need to separately update the code in the volume, restart the dependent services, and hope they both work. Actually baking the code into the images makes it possible to test the complete setup before you try to deploy it.
Sharing code is an anti-pattern in a distributed model like Swarm. Like David says, you'll need that code in the image builds, even if there's duplicate data. There are lots of ways to have images built on top of others to limit the duplicate data.
If you still need to share data between containers in swarm on a file system, you'll need to look at some shared storage like AWS EFS (multi-node read/write) plus REX-Ray to get your data to the right containers.
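As a rough sketch of that shared-storage approach (the volume driver name and its configuration are assumptions and depend on which REX-Ray plugin you install, so treat this as illustrative only), the stack file would declare the volume with a driver instead of the default local one:
version: "3.4"
services:
  A:
    image: repository.com/A:1.0.0   # hypothetical prebuilt image for service A
    volumes:
      - shared:/data                # same backing storage regardless of which node runs A
volumes:
  shared:
    driver: rexray/efs              # assumed plugin name; the plugin must be installed on every node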
Also, depends_on doesn't work in swarm. Your apps in a distributed system need to handle the lack of connection to other services in a predictable way. Maybe they just exit (and swarm will re-create them) or go into a retry loop in code, etc. depends_on is meant for the local docker-compose CLI in development, where you want to spin up an app and its dependencies by doing something like docker-compose up api.
