Is it possible for a swarm node to monitor itself, and set itself to drain under certain conditions? Much like the HEALTHCHECK instruction in a Dockerfile, I'd like to specify a script that determines the node's health.
[Edit] For instance, this just started occurring today:
$ sudo docker run --rm hello-world
docker: Error response from daemon: failed to update the store state of sandbox:
failed to update store for object type *libnetwork.sbState: invalid character 'H'
looking for beginning of value.
I know how to fix this particular problem, but the node still reported Ready and Active, and was accepting tasks it could not run. A health check would have been able to determine the node could not run containers, and disable the node.
How else can you achieve a self-healing infrastructure?
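One way to approximate this could be a periodic self-check that drains the node whenever a throwaway container fails to start. This is only a sketch of the idea, not a built-in feature: the cron scheduling, the use of the hostname as the swarm node name, and the manager endpoint below are all assumptions.

#!/bin/sh
# Hypothetical self-check, run periodically (e.g. from cron) on each node.
# If a throwaway container cannot be started locally, drain this node so
# the scheduler stops assigning new tasks to it.
NODE="$(hostname)"              # assumes the swarm node name equals the hostname
MANAGER="tcp://manager:2375"    # hypothetical manager endpoint; adjust to your setup

if ! docker run --rm hello-world > /dev/null 2>&1; then
    # docker node update must be executed against a manager
    docker -H "$MANAGER" node update --availability drain "$NODE"
fi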
Related
Hi, is there any way to change a node's status from Ready to Down?
I would like to change the status so that I can remove the node from the swarm.
The best way to approach removing a node from a Docker swarm is to set the availability of the chosen node to drain. Drain prevents new tasks from being scheduled on the node and at the same time starts moving all already scheduled/running tasks to other nodes. When this completes you will have a node in "maintenance/drain" mode that is empty of all tasks.
You can do this with docker node update --availability drain.
After the node has been successfully drained you can simply remove it with the docker node rm command.
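For example, on a manager (with <node> as a placeholder for the node name):

# stop scheduling new tasks on the node and move its running tasks elsewhere
docker node update --availability drain <node>

# once docker node ps <node> shows no running tasks, and the node itself has
# left the swarm (docker swarm leave, run on that node), remove it
docker node rm <node>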
Note: I've tried searching for existing answers in any way I could think of, but I don't believe there's any information out there on how to achieve what I'm after
Context
I have an existing swarm running a bunch of networked services across multiple hosts. The deployment is done via docker-compose build && docker stack deploy. Several of the services contain important state necessary for the functioning of the main service this stack is for, including when interacting with it via CLI.
Goal
How can I create an ad-hoc container within the existing stack running on my swarm for interactive diagnostics and troubleshooting of my main service? The service has a CLI interface, but it needs access to the other components for that CLI to function, thus it needs to be run exactly as if it were a service declared inside docker-compose.yml. Requirements:
I need to run it in an ad-hoc fashion. This is for troubleshooting by an operator, so I don't know when exactly I'll need it
It needs to be interactive, since it's troubleshooting by a human
It needs to be able to run an arbitrary image (usually the image built for the main service and its CLI, but sometimes other diagnostics might be needed through other containers I won't know ahead of time)
It needs to have full access to the network and other resources set up for the stack, as if it were a regular predefined service in it
So far the best I've been able to do is:
Find an existing container running my service's image
SSH into the swarm host on which it's running
docker exec -ti into it to invoke the CLI
This however has a number of downsides:
I don't want to be messing with an already running container: it has an important job I don't want to accidentally interrupt, its state might be unrelated to what I need to do, and I don't want to corrupt it
It relies on the service image also having the CLI installed. If I want to separate the two, I'm out of luck
It relies on some containers already running. If my service is entirely down and in a restart loop, I'm completely hosed because there's nowhere for me to exec in and run my CLI
I can only exec within the context of what I already have declared and running. If I need something I haven't thought to add beforehand, I'm sadly out of luck
Finding the specific host on which the container is running and going there manually is really annoying
What I really want is a version of docker run I could point to the stack and say "run in there", or docker stack run, but I haven't been able to find anything of the sort. What's the proper way of doing that?
Option 1
Deploy a diagnostics service as part of the stack - a container with useful tools in it, with an entrypoint of tail -f /dev/null - and use a placement constraint to deploy it to a known node.
services:
  diagnostics:
    image: nicolaka/netshoot
    command: tail -f /dev/null
    deploy:
      placement:
        constraints:
          - node.hostname == host1
NB. You do NOT have to deploy this service with your normal stack. It can be in a separate stack.yml file. You can simply stack deploy this file to your stack later, and as long as --prune is not used, the services are cumulative.
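As a sketch of how that could be used afterwards (the file name diagnostics.yml, the stack name mystack, and being logged into host1 are all assumptions):

# add or refresh only the diagnostics service in the existing stack
docker stack deploy -c diagnostics.yml mystack

# then, on host1 (the node named in the placement constraint), exec into it
docker exec -it "$(docker ps -q -f name=mystack_diagnostics)" bash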
Option 2
To allow regular containers to access your services, make your network attachable. If you haven't specified the network explicitly, you can just declare the default network explicitly:
networks:
  default:
    driver: overlay
    attachable: true
Now you can use docker run to attach a diagnostic container to the network:
docker -c manager run --rm --network <stack>_default -it nicolaka/netshoot
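If you are unsure what the generated network is called, you can look it up first (mystack is a placeholder stack name):

docker network ls --filter name=mystack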
Option 3
The third option does not address the need to directly access the node running the service, and it does not address the need to have an instance of the service running, but it does allow you to investigate a service without affecting its state and without needing tooling inside the container.
Start by executing the usual commands to discover the node and container name and id of the service task of interest:
docker service ps ${service} --no-trunc --format '{{.Node}} {{.Name}}.{{.ID}}' --filter desired-state=running
Then, assuming you have docker contexts that match your node names, pick one ${node} and ${container} from the list of {{.Node}} and {{.Name}}.{{.ID}} values and run a container such as ubuntu or netshoot, attaching it to the network namespace of the target container.
docker -c ${node} run --rm -it --network container:${container} nicolaka/netshoot
This container can be used to perform diagnostics in the context of the running service task, and then closed without affecting it.
Suppose I had 3 replicated images:
docker service create --name redis-replica --replicas=3 redis:3.0.6
Consider that there are two nodes connected (including the manager), and running the command docker service ps redis-replica yields this:
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
x1uumhz9or71 redis-replica.1 redis:3.0.6 worker-vm Running Running 9 minutes ago
j4xk9inr2mms redis-replica.2 redis:3.0.6 manager-vm Running Running 8 minutes ago
ud4dxbxzjsx4 redis-replica.3 redis:3.0.6 worker-vm Running Running 9 minutes ago
As you can see all tasks are running.
I have a scenario I want to fix:
Suppose I want to remove a redis container on the worker-vm. Currently there are two, but I want to make it one.
I could do this by going into the worker-vm, removing the container by docker rm. This poses a problem however:
Once Docker swarm sees that one of the tasks has gone down, it will immediately start another redis container on another node (manager or worker). As a result I will always have 3 tasks.
This is not what I want. Suppose I want to force Docker not to start a replacement when a container is removed.
Is this currently possible?
In Swarm mode, it is the orchestrator that schedules tasks for you. A task is the unit of scheduling, and each task invokes exactly one container.
What this means in practice is that you are not supposed to manage tasks manually; Swarm takes care of this for you.
You need to describe the desired state of your service. If you have placement preferences you can use --placement-pref in docker service commands, and you can specify the number of replicas, etc. For example:
docker service create \
--replicas 9 \
--name redis_2 \
--placement-pref 'spread=node.labels.datacenter' \
redis:3.0.6
You can limit the set of nodes where the task will be placed using placement constraints (https://docs.docker.com/engine/reference/commandline/service_create/#specify-service-constraints---constraint). Here is an example from the Docker docs:
$ docker service create \
--name redis_2 \
--constraint 'node.labels.type == queue' \
redis:3.0.6
I think that's the closest you can get to controlling tasks.
Once you have described your placement constraints/preferences, Swarm will make sure that the actual state of your service is in line with the desired state you described in the create command. You are not supposed to manage any further details beyond that.
If you change the actual state by killing a container for example, Swarm will re-align the state of your service to be in-line with your desired state again. This is what happened when you removed your container.
In order to change the desired state you can use the docker service update command.
The key point is that tasks are not the same as containers. Technically a task invokes exactly one container, but the two are not equal: a task is like a scheduling slot where the scheduler places a container.
The Swarm scheduler manages tasks (not you), that's why there is no command like docker task. You drive the mechanism by describing the desired state.
To answer your original question: yes, it is possible to remove a task, but you do it by updating the desired state of your service, as sketched below.
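As a sketch of what that could look like for the redis-replica service above (the exact constraint is an assumption about your intent):

# scale the service down instead of removing a container by hand
docker service update --replicas 2 redis-replica

# or add a placement constraint, e.g. to keep tasks off a particular node
docker service update --constraint-add 'node.hostname != worker-vm' redis-replica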
I have a swarm stack deployed. I removed a couple of services from the stack and tried to deploy them again. These services are showing with desired state Remove and current state Preparing, and their names have changed from the custom service names to random Docker names. Swarm is also trying to start these services, which are likewise stuck in Preparing. I ran docker system prune on all nodes and then removed the stack. All the services in the stack no longer exist except for the random ones. Now I can't delete them and they are still in the Preparing state. The services are not running anywhere in the swarm, but I want to know if there is a way to remove them.
I had the same problem. Later I found that the current state 'Preparing' indicates that Docker is trying to pull images from Docker Hub, but there is no clear indicator of this in docker service logs <serviceName>, even with compose file versions above '3.1'.
The pull can sometimes take a long time due to network bandwidth or other Docker-internal reasons.
Hope it helps! I will update the answer if I find more relevant information.
P.S. By switching the command to docker-compose up, I confirmed that docker stack deploy -c <your-compose-file> <appGroupName> was not actually stuck. For me, it took 20+ minutes to download my image for some reason.
So this suggests there is no open issue with docker stack deploy.
Adding a reference from Christian to round out and complete this answer.
Use docker-machine ssh to connect to a particular machine:
docker-machine ssh <nameOfNode/Machine>
Your prompt will change. You are now inside another machine. Inside this other machine do this:
tail -f /var/log/docker.log
You'll see the "daemon" log for that machine. There you'll see whether that particular daemon is doing the "pull", or what it is doing as part of the service preparation. In my case, I found something like this:
time="2016-09-05T19:04:07.881790998Z" level=debug msg="pull progress map[progress:[===========================================> ] 112.4 MB/130.2 MB status:Downloading
Which made me realise that it was just downloading some images from my docker account.
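On hosts where the daemon logs to journald rather than a file (an assumption about the host setup, not part of the answer above), the equivalent would be:

journalctl -u docker.service -f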
The question title is the specific problem I am trying to solve. But even more simply, is it at all possible to list all tasks in a service together with the node IPs running them?
docker service ps will list task IDs together with the hostname of the node the task is running on. No other node identifier is provided such as ID or IP.
But I am using Vagrant to manage VMs and without a hostname configured, all host names are named the same ("vagrant"). This makes it very hard for me to figure out exactly which node is running the task!
This is important because I have to manually delete unused images or risk having the machines crash in the future when there's no more disk space. So figuring out where a task ran is the first step of this process lol.
For you Vagrant users, I changed my hostname quite easily in the Vagrantfile using the config.vm.hostname option. But of course, the question is still totally legit.
I can manually run docker images or docker ps on each node to figure out which node stores the expected image and/or is currently running which container (the container name would be a concatenation of the task name and task ID, separated with a dot). But this is cumbersome.
I could also list all nodes with their IDs using docker node ls and then headhunt the task for each node, for example by using docker node ps 7b ("7b" is the first two letters in one of my node IDs). But this is cumbersome too and at best, I will "only" learn the node ID and not the IP.
But, I can find the IP using a node ID with a command like this: docker inspect 7b --format '{{.Status.Addr}}'. So getting at the IP directly is not a strict requirement and for a moment - when I understood this - I thought finding a node ID for a given task ID is going to be much easier!
But no. Even this seems to be impossible? As noted earlier, docker service ps does not give me the node ID. The docs for the command say that the placeholder .Name should give me the "Node ID", but this is wrong.
Until this moment I must have tried a billion different hacks.
One thing in particular that I find disturbing is that the docs for the docker node ps command explicitly states that it can be used to "list tasks running on one or more nodes" (emphasize mine). But if I do this: docker node ps vagrant I get an error message that the hostname is ambiguous because it resolves to more than one node! lol isn't that funny (even without using hostnames I have not gotten this command to work for listing tasks on multiple nodes). Not that it matters, because docker node ps just like docker service ps does not even output node IDs and the docs for both these commands lie about being able to do so.
So there you have it. I am left wondering if there's something right in front of me that I have missed or is the entire world relying on unique hostnames? It feels like I have to miss something obvious. Surely this oh so popular product Docker must be able to provide a way to find a node ID or node IP given a task ID? hmm.
I am running Docker version 17.06.2.
This gives me the node ID, given a <task ID>:
docker inspect <task ID> --format '{{.NodeID}}'
Use the node ID to get the node IP:
docker inspect <node ID> --format '{{.Status.Addr}}'
Or, all in one compressed line:
docker inspect -f '{{.Status.Addr}}' $(docker inspect -f '{{.NodeID}}' <task ID>)
As a bonus, the MAC address:
docker inspect -f '{{.NetworkSettings.Networks.ingress.MacAddress}}' $(docker inspect -f '{{.Status.ContainerStatus.ContainerID}}' <task ID>)
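Building on those commands, a small shell loop can list every running task of a service together with the IP of the node it runs on. This is only a sketch and assumes it is run against a manager (<service> is a placeholder):

# for each running task, resolve its node ID and print the node's address
for task in $(docker service ps <service> -q --filter desired-state=running); do
  node=$(docker inspect -f '{{.NodeID}}' "$task")
  echo "$task $(docker inspect -f '{{.Status.Addr}}' "$node")"
done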