DependsOn vs links - docker-compose vs ECS task - docker

The DependsOn property of an ECS container definition is used for container dependencies.
The links property of Docker Compose provides service dependencies.
We are mapping a Docker Compose file to an ECS task definition.
Conceptually, is the purpose of the links property in Docker Compose similar to that of the DependsOn property of an ECS container definition?

links: was an important part of the first-generation Docker networking setup. Once Docker introduced the docker network series of commands and Docker Compose set up a private network by default, it became much less important, and there's not really any reason to use it at all in modern Docker.
Compose has its own depends_on: option. If service a depends_on: [b], then when a starts up (maybe because you explicitly docker-compose up a, or maybe just as an ordering constraint) the b container is guaranteed to exist. If b is a database or some other service that takes a while to start up, it's not guaranteed to be functional, but for instance b will be a valid host name from a's point of view.
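To make the ordering constraint concrete, here is a minimal Python sketch that derives one valid start order from depends_on lists. It is not how Compose is implemented; the service names and the start_order helper are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def start_order(depends_on):
    """Return one valid start order for a mapping of service -> list of dependencies."""
    # TopologicalSorter expects node -> predecessors, which is exactly the
    # shape of a depends_on mapping.
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical services: a depends on b, and b depends on db.
order = start_order({"a": ["b"], "b": ["db"], "db": []})
print(order)
```

The order only says which container is created first. It says nothing about whether the process inside that container is ready to accept connections.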
Within a single ECS task, one container can dependsOn others. This is similar to the Compose depends_on: setting, but it has an additional condition parameter that can support a couple of different lifecycles. Of note, one container can wait for another to be "condition": "HEALTHY", a check that in Docker Compose requires the waiting container to manually check on its own (often with a helper script like wait-for-it.sh); it can also wait for another container to "condition": "COMPLETE" if one container just does setup for another.
If you're porting a Docker Compose file to an ECS task, I'd start by trying to replace links: with depends_on:, which shouldn't cause much functional change; translating this to ECS, the semantics of that are very similar to "dependsOn": [{"condition": "START"}].
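As a rough illustration of that translation, the following Python sketch turns a Compose-style depends_on list into the dependsOn entries of an ECS container definition. The helper name, the container names and the image are hypothetical; containerName and condition are the two fields of each ECS dependency entry.

```python
import json


def to_ecs_depends_on(depends_on, condition="START"):
    """Map a Compose depends_on list to an ECS dependsOn list."""
    return [{"containerName": name, "condition": condition} for name in depends_on]


# Hypothetical container "a" whose Compose service used links: [b] / depends_on: [b].
container_def = {
    "name": "a",
    "image": "example/a:latest",  # placeholder image, not from the question
    "essential": True,
    "dependsOn": to_ecs_depends_on(["b"]),
}
print(json.dumps(container_def, indent=2))
```

Passing condition="HEALTHY" instead only makes sense if container b also defines a healthCheck in the same task definition.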

Related

github workflow: "ECONNREFUSED 127.0.0.1:***" error when connecting to docker container

In my github actions workflow I am getting this error (ECONNREFUSED) while running my jest test script. The test uses axios to connect to my api which is running in a container bootstrapped via docker-compose (created during the github workflow itself). That network has just 2 containers: the api, and postgres. So my test script is, I assume, on the "host network" (github workflow), but it couldn't reach the docker network via the containers' mapped ports.
I then skipped jest test entirely and just tried to directly ping the containers. Didn't work.
I then modified the workflow to inspect the default docker network that should have been created:
UPDATE 1
I've narrowed down the issue as follows. When I modified the compose file to rely on the default network (I no longer have a networks: in my compose file):
So it looks as though the containers were never attached to the default bridge network.
UPDATE 2
It looks like I just have the wrong paradigm. After reading this: https://help.github.com/en/actions/configuring-and-managing-workflows/about-service-containers
I realise this is not how GA expects us to instantiate containers at all. Looks like I should be using services: nodes inside the workflow file, not using containers from my own docker-compose files. 🤔 Gonna try that...
So the answer is:
do not use docker-compose to build your own custom containers. GA does not support this yet.
Use services: in your workflow .yml file to launch your containers, which must be public docker images. If your container is based on a private image or custom dockerfile, it's not supported yet by GA.
So instead of "docker-compose up" to bootstrap postgres + my api for integration testing, I had to:
1. Create postgres as a service container in my github workflow .yml
2. Change my test command in package.json to first start the api as a background process (because I can't create my own docker image from it 🙄) and then invoke my test framework as the foreground process, so: npm run start & npm run <test launch cmds>
This worked.
There are several possibilities here.
Host Network
Since you are using docker compose, when you start the api container, publish the endpoint that the api is listening on to the host machine. You can achieve this by doing:
version: "3"
services:
  api:
    ...
    ports:
      - "3010:3010"
in your docker-compose.yml. This will publish the port, similar to doing docker run ... --publish 3010:3010. See reference here: https://docs.docker.com/compose/compose-file/#ports
Compose network
By default, docker-compose creates a network named after the project, <project>_default (backend-net_default in your case). Containers created by this docker-compose.yml will have access to other containers via this network. The host name to access other containers on the network is simply the name of the service. For example, your tests could access the api endpoint using the host api (assuming that is the name of your api service), e.g.:
http://api:3010
The one caveat here is that the tests must be launched in a container that is managed by that same docker-compose.yml, so that it may access the common backend-net_default network.

Containers with pipelines: should/can you keep your data separate from the container

I am very new to containers and I was wondering if there is a "best practice" for the following situation:
Let's say I have developed a general pipeline using multiple software tools to analyze next generation sequencing data (I work in science). I decided to make a container for this pipeline so I can share it easily with colleagues. The container would have the required tools and their dependencies installed, as well as all the scripts to run the pipeline. There would be some wrapper/master script to run the whole pipeline, something like: bash run-pipeline.sh -i input data.txt
My question is: if you are using a container for this purpose, do you need to place your data INSIDE the container OR can you run the pipeline on your data which is placed outside your container? In other words, do you have to place your input data inside the container to then run the pipeline on it?
I'm struggling to find a case example.
Thanks.
To me the answer is obvious - the data belongs outside the image.
The reason is that if you build an image with the data inside, how are your colleagues going to use it with their data?
It does not make sense to talk about the data being inside or outside the container. The data will be inside the container. The only question is how did it get there?
My recommended process is something like:
1. Create an image with all your scripts, required tools, dependencies, etc; but not data. For simplicity let us name this image pipeline.
2. Bind mount data in volumes to the container:
   docker container create --mount type=bind,source=/path/to/data/files/on/host,target=/srv/data,readonly=true pipeline
Of course, replace /path/to/data/files/on/host with the appropriate path. You can store your data in one place and your colleagues in another. You make a substitution appropriate for you and they will have to make a substitution appropriate for them.
However inside the container, the data will be at /srv/data. Your scripts can just assume that it will be there.
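A script inside the image could locate its inputs along these lines. This is only a sketch: /srv/data is the mount target used in the command above, while the PIPELINE_DATA_DIR variable, the find_inputs helper and the *.txt pattern are assumptions added for illustration.

```python
import os
from pathlib import Path

# /srv/data is the bind-mount target inside the container. PIPELINE_DATA_DIR is a
# hypothetical override, handy when running the script outside a container.
DATA_DIR = Path(os.environ.get("PIPELINE_DATA_DIR", "/srv/data"))


def find_inputs(data_dir=DATA_DIR, pattern="*.txt"):
    """Return the input files directly under data_dir, sorted for reproducible runs."""
    return sorted(p for p in Path(data_dir).glob(pattern) if p.is_file())


if __name__ == "__main__":
    for path in find_inputs():
        print(f"would process {path}")
```

Because the path is fixed inside the container, each colleague only changes the source= part of the mount on their own host, and the script stays untouched.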
To handle the described scenario I would recommend using files to exchange data between your processing steps. To bring the files into your container you can mount a local directory into it. That also gives you some persistence across container runs. The following example shows how to mount a local directory into your containers.
version: '3.2'
services:
  container1:
    image: "your.image1"
    volumes:
      - "./localpath:/container/internal"
  container2:
    image: "your.image2"
    volumes:
      - "./localpath:/container/internal"
  container3:
    image: "your.image3"
    volumes:
      - "./localpath:/container/internal"
The example uses a docker compose file to describe the dependencies between your containers. You can implement the same without docker-compose. Then you have to specify your container mounts in your docker run command.
https://docs.docker.com/engine/reference/commandline/run/

Docker swarm having some shared volume

I will try to describe my desired functionality:
I'm running docker swarm over docker-compose
In the docker-compose file I have several services; for simplicity, let's call them A, B and C.
Assume service C includes shared code modules that need to be accessible to services A and B.
My questions are:
1. Must each service that needs access to the shared volume mount the volume from service C into its own local folder (using the volumes section as below), or can it be accessed without mounting/copying it to a path in the local container?
2. In docker swarm, it can be that 2 instances of services A and B will reside on computer X, while service C will reside on computer Y.
Is it true that because the services are all maintained under the same docker swarm stack, they will communicate without problem with service C?
If not, which definitions are needed to achieve it?
My structure is something like that:
version: "3.4"
services:
  A:
    build: .
    volumes:
      - C:/usr/src/C
    depends_on:
      - C
  B:
    build: .
    volumes:
      - C:/usr/src/C
    depends_on:
      - C
  C:
    image: repository.com/C:1.0.0
    volumes:
      - C:/shared_code
volumes:
  C:
If what you’re sharing is code, you should build it into the actual Docker images, and not try to use a volume for this.
You’re going to encounter two big problems. One is getting a volume correctly shared in a multi-host installation. The second is a longer-term issue: what are you going to do if the shared code changes? You can’t just redeploy the C module with the shared code, because the volume that holds the code already exists; you need to separately update the code in the volume, restart the dependent services, and hope they both work. Actually baking the code into the images makes it possible to test the complete setup before you try to deploy it.
Sharing code is an anti-pattern in a distributed model like Swarm. Like David says, you'll need that code in the image builds, even if there's duplicate data. There are lots of ways to have images built on top of others to limit the duplicate data.
If you still need to share data between containers in swarm on a file system, you'll need to look at some shared storage like AWS EFS (multi-node read/write) plus REX-Ray to get your data to the right containers.
Also, depends_on doesn't work in swarm. Your apps in a distributed system need to handle the lack of connection to other services in a predictable way. Maybe they just exit (and swarm will re-create them) or go into a retry loop in code, etc. depends_on is meant for the local docker-compose cli in development, where you want to spin up an app and its dependencies by doing something like docker-compose up api.
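A retry loop of the kind mentioned above might look like the following Python sketch. The wait_for_port helper, its defaults and the example host and port in the comment are hypothetical; a real service would probably retry the actual request it needs rather than a bare TCP connect.

```python
import socket
import time


def wait_for_port(host, port, attempts=10, delay=1.0):
    """Try to open a TCP connection, retrying with a fixed delay; return True on success."""
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            # Dependency not reachable yet: sleep and retry, unless this was the last attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


# Hypothetical use at service startup ("C" and 8080 are placeholder values):
#   if not wait_for_port("C", 8080):
#       raise SystemExit("dependency unavailable")  # non-zero exit, swarm re-creates the task
```

Exiting with a non-zero status when the dependency never appears is the "just exit" strategy; the loop itself is the "retry in code" strategy.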

What is the difference between docker and docker-compose

docker and docker-compose seem to be interacting with the same dockerFile, what is the difference between the two tools?
The docker cli is used when managing individual containers on a docker engine. It is the client command line to access the docker daemon api.
The docker-compose cli can be used to manage a multi-container application. It also moves many of the options you would enter on the docker run cli into the docker-compose.yml file for easier reuse. It works as a front end "script" on top of the same docker api used by docker, so you can do everything docker-compose does with docker commands and a lot of shell scripting. See this documentation on docker-compose for more details.
Update for Swarm Mode
Since this answer was posted, docker has added a second use of docker-compose.yml files. Starting with the version 3 yml format and docker 1.13, you can use the yml with docker-compose and also to define a stack in docker's swarm mode. To do the latter you need to use docker stack deploy -c docker-compose.yml $stack_name instead of docker-compose up and then manage the stack with docker commands instead of docker-compose commands. The mapping is a one for one between the two uses:
Compose Project -> Swarm Stack: A group of services for a specific purpose
Compose Service -> Swarm Service: One image and its configuration, possibly scaled up.
Compose Container -> Swarm Task: A single container in a service
For more details on swarm mode, see docker's swarm mode documentation.
docker manages single containers
docker-compose manages multiple container applications
Usage of docker-compose requires 3 steps:
Define the app environment with a Dockerfile
Define the app services in docker-compose.yml
Run docker-compose up to start and run app
Below is a docker-compose.yml example taken from the docker docs:
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/code
      - logvolume01:/var/log
    links:
      - redis
  redis:
    image: redis
volumes:
  logvolume01: {}
A Dockerfile is a text document that contains all the commands/Instruction a user could call on the command line to assemble an image.
Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration. By default, docker-compose expects the name of the Compose file as docker-compose.yml or docker-compose.yaml. If the compose file has a different name we can specify it with -f flag.
Check here for more details
docker or more specifically docker engine is used when we want to handle only one container whereas the docker-compose is used when we have multiple containers to handle. We would need multiple containers when we have more than one service to be taken care of, like we have an application that has a client server model. We need a container for the server model and one more container for the client model. Docker compose usually requires each container to have its own dockerfile and then a yml file that incorporates all the containers.

Managing a group of docker containers without the sweat

I am using a bash script to spin up a virtual network with two docker containers on it. This feels prehistoric. Is there some tool that can spin such an ensemble up and down & show its current status, or does one have to take care of that on their own?
In the case of docker-compose, it is unclear from the docker documentation whether docker-compose is self-contained or tied to swarm, and an authoritative example of a compose definition file, with commands for starting and stopping the ensemble, would be very helpful.
E.g. here is what a bash script would do to define/start an application of two interrelated containers, needless to say this script does not help with managing its lifecycle beyond just starting it up once.
docker network create --driver bridge FooAppNet
docker run --rm --net=FooAppNet --name=component1 -p 9000:9000 component1-image
docker run --rm --net=FooAppNet --name=component2 component2-image
Also in this example, container component1 exposes port 9000 to the host, and its contained application has it hardwired in its configuration file, to consume the service of component2 by its name (following the common docker networking practice relying on docker networks' internal DNS).
For the example you've given, the following Docker Compose file would give you what you want:
component1:
  image: component1-image
  net: FooAppNet
  container_name: component1
  ports:
    - "9000:9000"
component2:
  image: component2-image
  net: FooAppNet
  container_name: component2
If you store this in a docker-compose.yml file and then run docker-compose up -d it will create/start/restart your containers and assign them to your FooAppNet network.
The -d flag runs the containers in detached mode and prevents the logging output being printed to your terminal window when you start the containers. You can still get their log via docker logs -f ... like with any other container.
You can then use docker-compose down and docker-compose restart etc to control the ensemble's lifecycle. As an aside, using variables can spice up the definition file towards greater flexibility.
See in the comments below about using the network automatically spun up by docker compose.
TL;DR: see the beginning section of https://docs.docker.com/compose/networking/ for the solution. It walks you through the entire necessary configuration. It works nicely, though you need to master the various docker-compose command-line options to be productive with it.
