docker swarm: A stack just for shared networks

I have many docker compose files which describe multiple stacks (Application, Monitoring infra, Logging Infra, Some other application). Some of these stacks need to share a network.
Since the dependencies between the stacks (X needs Y to start first, Y needs Z) are becoming more and more complicated, I wanted to introduce one stack that contains all the networks that will be shared, so that I can then deploy the other stacks in any order.
version: "3.1"
networks:
iotivity:
proxy:
Unfortunately a compose file like this doesn't create the networks. It doesn't throw an error, but nothing is created. Does anyone know how I can achieve this?

You could use a dummy image. Dockerfile (copied from Mailu):
# This is an idle image to dynamically replace any component if disabled.
FROM alpine
CMD sleep 1000000d
A script is probably still more elegant; I'm just pointing out the possibility.
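If you go the script route instead, a minimal sketch could look like this (network names taken from the question; --attachable is only needed if standalone containers also have to join the networks):
# Create the shared overlay networks once, outside of any stack (run on a swarm manager)
docker network create --driver overlay --attachable iotivity
docker network create --driver overlay --attachable proxy
The other stacks can then declare these networks with external: true instead of defining them, and be deployed in any order.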

Related

Docker - using labels to influence the start-up sequence

My Django application uses Celery to process tasks on a regular basis. Sadly this results in having 3 containers (App, Celery Worker, Celery Beat), each of them having its very own startup shell script instead of a docker entrypoint script.
So my idea was to have a single entrypoint script which is able to process the labels I set in my docker-compose.yml. Based on the labels, the container should start as an App, Celery Beat or Celery Worker instance.
I have never done such an implementation before, but I am asking myself whether this is even possible, as I saw something similar in the traefik load balancer project, see e.g.:
loadbalancer:
  image: traefik:1.7
  command: --docker
  ports:
    - 80:80
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  networks:
    - frontend
    - backend
  labels:
    - "traefik.frontend.passHostHeader=false"
    - "traefik.docker.network=frontend"
...
I didn't find any good material about this on the web, on how to implement such a scenario, or whether it's even possible the way I imagine it. Has somebody done it like that before, or should I rather stick with 3 separate shell scripts, one for each service?
You can access the labels from within the container, but it does not seem to be as straightforward as other options and I do not recommend it. See this StackOverflow question.
If your use cases (== entrypoints) are more different than alike, it is probably easier to use three entrypoints or three commands.
If your use cases are more similar, then it is easier and clearer to simply use environment variables.
Another nice alternative that I like to use is to create one entrypoint shell script that accepts arguments, so you have a single entrypoint and the arguments are provided via the command definition.
Labels are designed to be used by the docker engine and other applications that work at the host or docker-orchestrator level, and not at the container level.
I am not sure how the traefik project implements that. If they use it, it should be entirely possible.
However, I would recommend using environment variables instead of docker labels. Environment variables are the recommended way to pass configuration parameters to a cloud-native app. Labels are more suited to service metadata, so you can identify and filter specific services. In your scenario you could have something like this:
version: "3"
services:
celery-worker:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=celery-worker
celery-beat:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=celery-beat
app:
image: generic-dev-image:latest
environment:
- SERVICE_TYPE=app
Then you can use the SERVICE_TYPE environment variable in your docker entrypoint to launch the specific service.
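A minimal sketch of such an entrypoint, assuming the SERVICE_TYPE values from the compose file above and placeholder project/server names (myproject, gunicorn):
#!/bin/sh
# Hypothetical entrypoint: dispatch on the SERVICE_TYPE variable set in docker-compose.yml
case "$SERVICE_TYPE" in
  celery-worker) exec celery -A myproject worker --loglevel=info ;;
  celery-beat)   exec celery -A myproject beat --loglevel=info ;;
  app)           exec gunicorn myproject.wsgi:application --bind 0.0.0.0:8000 ;;
  *)             echo "Unknown SERVICE_TYPE: $SERVICE_TYPE" >&2; exit 1 ;;
esac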
However (again), there is nothing wrong with having 3 different docker images. In fact, that's the idea of containers (and microservices). You encapsulate the processes in images and instantiate them in containers. Each one of them will have different purposes and lifecycles. For development purposes, there is nothing wrong with your implementation. But in production, I would recommend separating the services into different images. Otherwise, you have big images, only use a third of the functionality in each service, and tightly couple the lifecycles of the services.

Is multiple Docker data-roots possible and how?

I have one container which needs a lot of space and I want it to use a dedicated drive on my server.
This answer comprehensively explains how to move docker data-root. But is it possible to have two data-roots and assign a specific container to the second one?
You sound like you have specific container-based needs.
Thus, moving the docker data-root to another location does not seem to be the right answer here (though you may do it anyway).
What you need are "volumes".
Wrap your image in a docker-compose file and mount some container directories as volumes pointing to a "host" path (outside of the docker data-root). These should be the directories that will need a lot of space, and they should point to a volume group or external mount point (e.g. NFS) with sufficient space!
E.g.:
...
my-service:
  image: my-image
  volumes:
    - "/path/within/host/opt/data/tmp/:/path/within/container/cache/:rw"
    - "/path/within/host/opt/data/layers/:/path/within/container/layers/:rw"
    - "/path/within/host/opt/data/logs/:/path/within/container/logs/:rw"
...
(note that "rw" can be omitted here, since it's the default value)

Containers with pipelines: should/can you keep your data separate from the container

I am very new to containers and I was wondering if there is a "best practice" for the following situation:
Let's say I have developed a general pipeline using multiple software tools to analyze next generation sequencing data (I work in science). I decided to make a container for this pipeline so I can share it easily with colleagues. The container would have the required tools and their dependencies installed, as well as all the scripts to run the pipeline. There would be some wrapper/master script to run the whole pipeline, something like: bash run-pipeline.sh -i input data.txt
My question is: if you are using a container for this purpose, do you need to place your data INSIDE the container, OR can you run the pipeline on your data which is placed outside your container? In other words, do you have to place your input data inside the container to then run the pipeline on it?
I'm struggling to find a case example.
Thanks.
To me the answer is obvious - the data belongs outside the image.
The reason is that if you build an image with the data inside, how are your colleagues going to use it with their data?
It does not make sense to talk about the data being inside or outside the container. The data will be inside the container. The only question is how did it get there?
My recommended process is something like:
Create an image with all your scripts, required tools, dependencies, etc; but not data. For simplicity let us name this image pipeline.
Bind mount the data into the container: docker container create --mount type=bind,source=/path/to/data/files/on/host,target=/srv/data,readonly=true pipeline
Of course, replace /path/to/data/files/on/host with the appropriate path. You can store your data in one place and your colleagues in another. You make a substitution appropriate for you and they will have to make a substitution appropriate for them.
However inside the container, the data will be at /srv/data. Your scripts can just assume that it will be there.
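A one-shot run of the wrapper script could then look like this (a sketch; the results mount and the input file name are assumptions):
# Hypothetical: run the pipeline once against read-only input data, writing results to a second mount
docker run --rm \
  --mount type=bind,source=/path/to/data/files/on/host,target=/srv/data,readonly=true \
  --mount type=bind,source=/path/to/results/on/host,target=/srv/results \
  pipeline bash run-pipeline.sh -i /srv/data/input_data.txt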
To handle the described scenario I would recommend using files to exchange data between your processing steps. To bring the files into your container you can mount a local directory into the container, which also gives your containers some kind of persistence. How to mount a local directory into a container is shown in the following example.
version: '3.2'
services:
  container1:
    image: "your.image1"
    volumes:
      - "./localpath:/container/internal"
  container2:
    image: "your.image2"
    volumes:
      - "./localpath:/container/internal"
  container3:
    image: "your.image3"
    volumes:
      - "./localpath:/container/internal"
The example uses a docker-compose file to describe the dependencies between your containers. You can achieve the same without docker-compose; in that case you have to specify the mounts in your docker run command.
https://docs.docker.com/engine/reference/commandline/run/
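Without docker-compose, the equivalent of the mounts above might look like this (a sketch using the same image names and host path):
# Hypothetical docker run equivalents of the compose mounts above
docker run -d -v "$(pwd)/localpath:/container/internal" your.image1
docker run -d -v "$(pwd)/localpath:/container/internal" your.image2
docker run -d -v "$(pwd)/localpath:/container/internal" your.image3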

Docker swarm having some shared volume

I will try to describe my desired functionality:
I'm running docker swarm with docker-compose files.
In the docker-compose file I have services; for simplicity let's call them A, B and C.
Assume service C contains shared code modules that need to be accessible to services A and B.
My questions are:
1. Must each service that needs access to the shared volume mount the C volume to its own local folder (using the volumes section as below), or can it be accessed without mounting/copying to a path in the local container?
2. In docker swarm, it can happen that instances of services A and B reside on computer X, while service C resides on computer Y.
Is it true that, because the services are all maintained under the same docker swarm stack, they will be able to communicate with service C without problems?
If not, which definitions are needed to achieve it?
My structure is something like this:
version: "3.4"
services:
A:
build: .
volumes:
- C:/usr/src/C
depends_on:
- C
B:
build: .
volumes:
- C:/usr/src/C
depends_on:
- C
C:
image: repository.com/C:1.0.0
volumes:
- C:/shared_code
volumes:
C:
If what you’re sharing is code, you should build it into the actual Docker images, and not try to use a volume for this.
You’re going to encounter two big problems. One is getting a volume correctly shared in a multi-host installation. The second is a longer-term issue: what are you going to do if the shared code changes? You can’t just redeploy the C module with the shared code, because the volume that holds the code already exists; you need to separately update the code in the volume, restart the dependent services, and hope they both work. Actually baking the code into the images makes it possible to test the complete setup before you try to deploy it.
Sharing code is an anti-pattern in a distributed model like Swarm. Like David says, you'll need that code in the image builds, even if there's duplicate data. There are lots of ways to have images built on top of others to limit the duplicate data.
If you still need to share data between containers in swarm on a file system, you'll need to look at some shared storage like AWS EFS (multi-node read/write) plus REX-Ray to get your data to the right containers.
Also, depends_on doesn't work in swarm. Your apps in a distributed system need to handle the lack of connection to other services in a predictable way. Maybe they just exit (and swarm will re-create them) or go into a retry loop in code, etc. depends_on is meant for the local docker-compose CLI in development, where you want to spin up an app and its dependencies by doing something like docker-compose up api.
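For example, a simple startup wrapper with a retry loop could replace depends_on (a sketch only; the host name c-service and port 80 are assumptions):
#!/bin/sh
# Hypothetical startup wrapper: wait for service C instead of relying on depends_on
until nc -z c-service 80; do
  echo "waiting for C..."
  sleep 2
done
exec "$@"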

Scaling Docker containers in Rancher with different but persistent volumes

I'm currently trying to bridge the gap between persistent, but unique volumes while scaling containers with Rancher (alternatively Docker Compose, since this is more of an abstract question).
Take a Minecraft server as an example: I have a service defined in Rancher/Compose which uses a named volume as its data/world directory (e.g. -v minecraft_data:/data, where the Minecraft image loads its world files from this /data directory). The reason I'm using such a named volume is that I want it to persist between service upgrades (e.g. when I change the image version or some environment variables), which would not be possible with an anonymous volume.
Now when trying to scale up my service, I'm either getting multiple containers accessing the same data (not good for many use cases), or losing the service upgradeability when using anonymous volumes.
Are there any tools, best practices or patterns that might help with this issue?
In current versions of rancher (v1.4 at this time) storage drivers can be plugged in at the environment infrastructure level. This allows you to create volumes that are scoped at the environment, stack, or container.
For your use case, it sounds like per-container scope is what you need. Using rancher-compose you do something like:
version: '2'
services:
  foo:
    image: busybox
    volumes:
      - bar:/var/lib/storage
    command: /bin/sh -c 'while true; do sleep 500; done'
volumes:
  bar:
    per_container: true
Then, rancher-compose up -d will create the stack and service with one container and a unique volume. rancher scale foo=2 will create another container with its own volume, etc. You can also specify volume storage drivers for each volume like rancher-ebs or rancher-nfs with their respective options.
I think what you want is to have different instances of the entire project. scale implies identical clones, but if they have different data, they are not identical.
Instead of using scale, I would start different instances with different project names: https://docs.docker.com/compose/overview/#multiple-isolated-environments-on-a-single-host
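For example (a sketch; the project names are arbitrary), each project gets its own namespaced set of named volumes:
# Hypothetical: two independent copies of the stack, each with its own minecraft_data volume
docker-compose -p world1 up -d   # creates volume world1_minecraft_data
docker-compose -p world2 up -d   # creates volume world2_minecraft_data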
