Best Docker Stack equivalent for docker-compose "--exit-code-from" option?

I have a docker-compose file with 4 services. Services 1, 2, and 3 are job executors. Service 4 is the job scheduler. After the scheduler has finished running all its jobs on the executors, it returns 0 and terminates. However, the executor services still need to be shut down. With standard docker-compose this is easy: just use the "--exit-code-from" option:
Terminate docker compose when test container finishes
However, when a version 3.0+ compose file is deployed via Docker Stack, I see no equivalent way to wait for one service to complete and then terminate all remaining services. https://docs.docker.com/engine/reference/commandline/stack/
A few possible approaches are discussed here -
https://github.com/moby/moby/issues/30942
The solution from miltoncs seems reasonable at first glance:
https://github.com/moby/moby/issues/30942#issuecomment-540699206
The concept suggested is querying every second with docker stack ps to get service status, then removing all services with docker stack rm when done. I'm not sure how all the constant stack ps traffic would scale with thousands of jobs running in a cluster. Could it end up bogging down the ingress network?
Does anyone have experience / success with this or similar solutions?
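A minimal sketch of the polling approach from that issue, assuming a stack named jobs whose scheduler service is jobs_scheduler (both names are placeholders). The terminal-state check is a pure function so it can be exercised without a running swarm:

```shell
#!/bin/sh
# Succeeds when every line of task-state input is terminal
# (Complete, Shutdown, or Failed).
all_terminal() {
  ! printf '%s\n' "$1" | grep -qvE '^(Complete|Shutdown|Failed)'
}

# Poll the scheduler service and tear the stack down once it is done.
# Call as: wait_and_remove jobs jobs_scheduler
wait_and_remove() {
  stack=$1; scheduler=$2
  while :; do
    # First word of CurrentState is the state name, e.g. "Running"
    states=$(docker service ps "$scheduler" \
      --format '{{.CurrentState}}' | awk '{print $1}')
    if all_terminal "$states"; then
      docker stack rm "$stack"
      return
    fi
    sleep 5  # poll every 5s rather than 1s to reduce manager load
  done
}
```

One point on the scaling worry: docker stack ps and docker service ps talk to the manager's API socket, not the overlay data plane, so polling from a manager node should not touch the ingress network at all.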

Perform a docker-compose pull via Ansible

Question
With the Ansible docker_compose module, is it possible to perform a docker-compose pull and/or docker-compose build without actually starting the services?
What have I tried?
I've attempted:
- name: Build & pull services
  become: yes
  docker_compose:
    project_src: "{{ installation_path }}"
    build: yes
    state: present
    stopped: yes
but this seems to start the services as well (even though I have stopped: yes).
Use case
The actual situation is that starting the services causes port conflicts with existing processes. So the idea is to:
Stop the conflicting processes
Start the docker services
The problem is that one of these processes is the one that resolves DNS queries, so stopping it before starting the Docker services leads to an attempt to fetch the Docker images from the registry that fails with a DNS resolution error.
My idea was to:
Pull every necessary image
Stop the conflicting processes
Start the docker services
According to this GitHub issue this is not possible and will likely remain so in the near future, given that the docker_* modules are not actively maintained.
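A common workaround, since the docker_compose module can't do a bare pull, is to shell out to the CLI from Ansible instead. A sketch, reusing the installation_path variable from the question:

```yaml
- name: Pull images without starting services
  become: yes
  command: docker-compose pull
  args:
    chdir: "{{ installation_path }}"
```

This only talks to the registry, so it can run before the DNS-resolving process is stopped; the later docker_compose task then starts the services from the local image cache.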

How to run a container from docker-compose after 1 hour

I have 10 containers in docker-compose.
I want 9 of them to start working when I bring the compose project up, and docker-compose to run the 10th container only after 1 hour.
Currently it runs all the containers at once.
How can I achieve this?
Docker Compose doesn’t directly have this functionality. (Kubernetes doesn’t either, though it does have the ability to run a short-lived container at a specified time of day.)
Probably the best workaround to the problem as you've stated it is to use a tool like at(1) to run an additional container at a later time:
at +1h docker run ...
My experience has generally been that it can get a little messy to depend on starting and stopping Docker containers for workflow management. You may be better off always starting every container every time, and either running a pool of workers against a job-queue system like RabbitMQ and injecting a job after an hour, or using a language-native scheduled-task library in your application.
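If you do want to keep everything inside the compose file, another workaround is to wrap the tenth container's startup in a sleep. A sketch; the image name and the real command are placeholders:

```yaml
services:
  delayed-worker:
    image: myapp:latest            # placeholder image
    # Sleep for an hour, then exec the real command so it becomes PID 1
    entrypoint: ["sh", "-c", "sleep 3600 && exec /app/start.sh"]
```

Be aware that with a restart policy like restart: always, the container will sleep for another hour after every restart, so this is only suitable for one-shot delays.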

docker swarm services stuck in preparing

I have a swarm stack deployed. I removed a couple of services from the stack and tried to deploy them again. These services now show desired state Remove and current state Preparing, and their names have changed from the custom service names to random Docker names. Swarm also keeps trying to start these services, which stay stuck in Preparing. I ran docker system prune on all nodes and then removed the stack. None of the services from the stack exist anymore except for the random ones; I can't delete them and they are still in the Preparing state. The services are not running anywhere in the swarm, but I want to know if there is a way to remove them.
I had the same problem. Later I found that the current state 'Preparing' indicates that Docker is trying to pull images from Docker Hub, but there is no clear indicator of this in docker service logs <serviceName> (available with compose file versions above 3.1).
The pull can sometimes suffer from latency due to network bandwidth or other Docker-internal reasons.
Hope it helps! I will update the answer if I find more relevant information.
P.S. I confirmed that docker stack deploy -c <your-compose-file> <appGroupName> was not actually stuck by switching the command to docker-compose up; for me, the image simply took 20+ minutes to download.
So this suggests there is no open issue with docker stack deploy itself.
Adding Christian's approach to complete this answer.
Use docker-machine ssh to connect to a particular machine:
docker-machine ssh <nameOfNode/Machine>
Your prompt will change. You are now inside another machine. Inside this other machine do this:
tail -f /var/log/docker.log
You'll see the "daemon" log for that machine. There you'll see whether that particular daemon is doing a "pull", or what it is doing as part of the service preparation. In my case, I found something like this:
time="2016-09-05T19:04:07.881790998Z" level=debug msg="pull progress map[progress:[===========================================> ] 112.4 MB/130.2 MB status:Downloading
Which made me realise that it was just downloading some images from my docker account.
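On current Docker versions you can often get the same information without ssh-ing into each node, since task-level errors and progress are surfaced by the CLI. A sketch; the service name and <task-id> are placeholders:

```shell
# Show each task's state and any error, without truncating the message
docker service ps --no-trunc my_stack_my_service

# Inspect one stuck task directly; .Status.Err holds the failure reason
docker inspect --format '{{.Status.State}}: {{.Status.Err}}' <task-id>
```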

Docker services route network before task is actually up - zero downtime

I'm currently running Docker version 18.03.1-ce, build 9ee9f40, on multiple nodes. My setup is an nginx service plus multiple Java RESTful API services running in a WildFly cluster.
For my API services I've configured a simple healthcheck to determine whether my API task is actually up:
HEALTHCHECK --interval=5m --timeout=3s \
--retries=2 --start-period=1m \
CMD curl -f http://localhost:8080/api/healthcheck || exit 1
But even with the use of HEALTHCHECK, my nginx sometimes gets an error, caused by the fact that the API is still not fully up and can't serve REST requests yet.
The only solution that I've managed to get working so far is increasing the --start-period by hand to a much longer value.
How does the docker service load balancer decide, when to start routing requests to the new service?
Is setting a higher time with the --start-period currently the only way to prevent load balancer from redirecting traffic to a task that is not ready for traffic or am I missing something?
I've seen the "blue-green" deployment answers like this where you can manage zero downtime, but I'm still hoping this could be done with the use of docker services.
The routing mesh will start routing traffic on the "first successful healthcheck", even if future ones fail.
Whatever you put in the HEALTHCHECK command it needs to only start returning "exit 0" when things are truly ready. If it returns a good result too early, then that's not a good healthcheck command.
The --start-period only tells swarm when to kill the task if it's yet to receive a successful healthcheck in that time, but it won't cause green healthchecks to be ignored during the start period.
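In other words, the fix is usually to make the check itself stricter, so that the first green result really means "ready for traffic". A sketch, assuming the application can report readiness in its healthcheck response body (the /api/healthcheck path and the "ready" field are assumptions about your app):

```dockerfile
# Probe frequently; only report healthy once the app says it is
# actually ready to serve requests, not merely listening on the port.
HEALTHCHECK --interval=10s --timeout=3s --retries=3 --start-period=2m \
  CMD curl -fsS http://localhost:8080/api/healthcheck \
      | grep -q '"ready":true' || exit 1
```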

Can Docker Engine start containers in parallel

If I have scripts issuing docker run commands in parallel, the Docker engine appears to handle these commands in series. Since running a minimal container image with docker run takes around 100 ms to start, does this mean issuing commands in parallel to run 1000 containers will take the Docker engine 100 ms x 1000 = 100 s, or nearly 2 minutes? Is there some reason why the Docker engine is serial instead of parallel? How do people get around this?
How do people get around this?
a/ They don't start 1000 containers at the same time
b/ if they do, they might use a cluster management system like Docker Swarm to manage the whole process
c/ they start the 1000 containers in advance, in order to take the startup time into account
Truly parallelizing docker run commands could be tricky, considering some of those commands might depend on other containers being created/started first (like a docker run --volumes-from=xxx).
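The client-side fan-out can be sketched with xargs -P. The echo below is a stand-in for the real launch command so the pattern is runnable anywhere:

```shell
# Fan the launches out client-side, at most 16 in flight at once,
# instead of a serial loop. Replace the echo with your real command,
# e.g.  docker run -d --rm alpine:3 sleep 60
seq 1 100 | xargs -P 16 -I{} echo "started container {}"
```

Note this only parallelizes the client side; the daemon may still serialize parts of container setup internally, which is one reason people reach for Swarm or Kubernetes at that scale.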
