Docker cannot change Status - docker

Hi, is there any way to change a docker node's status from Ready to Down?
I would like to change the status so that I can remove the node from the docker swarm.

The best way to approach removing a node from a docker swarm is to set the availability of the chosen node to drain. Drain prevents new tasks from being scheduled on the node and at the same time starts moving all already scheduled/running tasks to other nodes. When this action is complete you will have a node in "maintenance/drain" mode which is empty of all tasks.
You can achieve this with docker node update --availability drain.
After the node has been successfully drained you can simply remove it with the docker node rm command, as sketched below.
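Put together, the sequence looks roughly like this (worker-1 is a placeholder node name; the update and rm commands run on a manager):
# on a manager
docker node update --availability drain worker-1
docker node ps worker-1            # repeat until no tasks remain on the node
# on worker-1 itself, leave the swarm so its status changes from Ready to Down
docker swarm leave
# back on the manager, remove the node entry
docker node rm worker-1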

Related

While draining a node during a Swarm update, how do you avoid a newly active node to receive all the rescheduled containers?

During an update (in place in this case) of our Swarm, we have to drain a node, update it, make it active again, drain the following node, etc...
It works perfectly for the first node, as the load of the containers to reschedule is spread quite fairly across all the remaining nodes, but things get difficult when draining the second node, because all the containers to reschedule go to the recently updated node, which has (almost) no tasks running.
The load when starting up all the services is huge compared to normal business; the node cannot keep up and some containers might fail to start due to healthcheck constraints and the max_attempts policy.
Do you know of a way to reschedule and avoid that spike and the unwanted results (priority, wait time, update strategy...)?
Cheers,
Thomas
This will need to be a manual process. You can pause scheduling on the node that is going down, and then gradually stop containers on that node so they migrate slowly to other nodes in the swarm cluster. E.g.
# on manager
docker node update --availability=pause node-to-stop
# on paused node
docker container ls --filter label=com.docker.swarm.task -q \
  | while read cid; do
      echo "stopping $cid"
      docker stop ${cid}
      echo "pausing"
      sleep 60
    done
Adjust the sleep command as appropriate for your environment.
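One possible follow-up once the node is empty (my addition, assuming the in-place update described in the question):
# keep the empty node from taking work back while you update it
docker node update --availability=drain node-to-stop
# ... perform the in-place update ...
# then return it to the scheduling pool
docker node update --availability=active node-to-stop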

Docker Swarm - Health check on a swarm node

Is it possible for a swarm node to monitor itself and set itself to drain under certain conditions? Much like HEALTHCHECK in a Dockerfile, I'd like to specify the script that determines the node's health.
[Edit] For instance, this just started occurring today:
$ sudo docker run --rm hello-world
docker: Error response from daemon: failed to update the store state of sandbox:
failed to update store for object type *libnetwork.sbState: invalid character 'H'
looking for beginning of value.
I know how to fix this particular problem, but the node still reported Ready and Active, and was accepting tasks it could not run. A health check would have been able to determine that the node could not run containers and disable the node.
How else can you achieve a self-healing infrastructure?
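There is no built-in node-level HEALTHCHECK, but as an illustration (my sketch, not an established feature), a periodic smoke test could drain the node when it fails. This assumes the script runs on a manager node, since docker node update only works against a manager, and uses hello-world as a stand-in test image:
#!/bin/sh
# run periodically, e.g. from cron, on a manager node
NODE=$(hostname)
if ! docker run --rm hello-world >/dev/null 2>&1; then
    # the smoke test failed: stop scheduling new tasks on this node
    docker node update --availability drain "$NODE"
fi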

Is it possible to remove a task in docker (docker swarm)?

Suppose I had 3 replicated images:
docker service create --name redis-replica --replicas=3 redis:3.0.6
Consider that there are two nodes connected (including the manager), and running the command docker service ps redis-replica yields this:
ID            NAME             IMAGE        NODE        DESIRED STATE  CURRENT STATE          ERROR  PORTS
x1uumhz9or71  redis-replica.1  redis:3.0.6  worker-vm   Running        Running 9 minutes ago
j4xk9inr2mms  redis-replica.2  redis:3.0.6  manager-vm  Running        Running 8 minutes ago
ud4dxbxzjsx4  redis-replica.3  redis:3.0.6  worker-vm   Running        Running 9 minutes ago
As you can see all tasks are running.
I have a scenario I want to fix:
Suppose I want to remove a redis container on the worker-vm. Currently there are two, but I want to make it one.
I could do this by going into the worker-vm and removing the container with docker rm. This poses a problem, however:
Once docker swarm sees that one of the tasks has gone down, it will immediately spin up another redis container on another node (manager or worker). As a result I will always have 3 tasks.
This is not what I want. Suppose I want to force docker not to relaunch another container when one is removed.
Is this currently possible?
In Swarm mode, it is the orchestrator that schedules the tasks for you. A task is the unit of scheduling and each task invokes exactly one container.
What this means in practice is that you are not supposed to manage tasks manually; Swarm takes care of this for you.
You need to describe the desired state of your service. If you have placement preferences you can express them with --placement-pref in docker service commands, and you can specify the number of replicas, etc. E.g.
docker service create \
  --replicas 9 \
  --name redis_2 \
  --placement-pref 'spread=node.labels.datacenter' \
  redis:3.0.6
You can limit the set of nodes where the task will be placed using placement constraints (https://docs.docker.com/engine/reference/commandline/service_create/#specify-service-constraints---constraint). Here is an example from the Docker docs:
$ docker service create \
  --name redis_2 \
  --constraint 'node.labels.type == queue' \
  redis:3.0.6
I think that's the closest solution to controlling tasks.
Once you have described your placement constraints/preferences, Swarm will make sure that the actual state of your service is in line with the desired state you described in the create command. You are not supposed to manage any further details beyond describing the desired state.
If you change the actual state, for example by killing a container, Swarm will re-align the state of your service with your desired state again. This is what happened when you removed your container.
In order to change the desired state you can use the docker service update command.
The key point is that tasks are not equal to containers. Technically a task invokes exactly one container, but they are not the same thing. A task is like a scheduling slot where the scheduler places a container.
The Swarm scheduler manages tasks (not you); that's why there is no docker task command. You drive the mechanism by describing the desired state.
To answer your original question: yes, it is possible to remove a task, and you do it by updating the desired state of your service.
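For the scenario in the question, a minimal sketch (using the service name from the question) is to lower the replica count so that Swarm itself removes a task; note that Swarm, not you, decides which node loses it:
# reduce the desired number of replicas from 3 to 2
docker service scale redis-replica=2
# equivalent form:
docker service update --replicas 2 redis-replica
If the goal is to keep tasks off a particular node entirely, adding a placement constraint with docker service update --constraint-add (e.g. 'node.hostname != worker-vm') is the supported route, rather than removing containers by hand.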

docker swarm services stuck in preparing

I have a swarm stack deployed. I removed a couple of services from the stack and tried to deploy them again. These services now show desired state Remove and current state Preparing, and their names changed from the custom service name to a random docker name. Swarm is also trying to start these services, which are likewise stuck in Preparing. I ran docker system prune on all nodes and then removed the stack. None of the services in the stack exist anymore except for the random ones. Now I can't delete them and they are still in the Preparing state. The services are not running anywhere in the swarm, but I want to know if there is a way to remove them.
I had the same problem. Later I found that the current state 'Preparing' indicates that docker is trying to pull images from docker hub, but there is no clear indicator of this in docker service logs <serviceName>, even with compose file versions above '3.1'.
This can add noticeable latency due to network bandwidth or other docker-internal reasons.
Hope it helps! I will update the answer if I find more relevant information.
P.S. I confirmed that docker stack deploy -c <your-compose-file> <appGroupName> was not actually stuck by switching the command to docker-compose up; for me, it took 20+ minutes to download my image for some reason.
So it turned out there was no underlying issue with docker stack deploy.
Adding a reference from Christian to round out this answer.
Use docker-machine ssh to connect to a particular machine:
docker-machine ssh <nameOfNode/Machine>
Your prompt will change. You are now inside another machine. Inside this other machine run:
tail -f /var/log/docker.log
You'll see the "daemon" log for that machine. There you'll see whether that particular daemon is doing a "pull", or what it is doing as part of the service preparation. In my case, I found something like this:
time="2016-09-05T19:04:07.881790998Z" level=debug msg="pull progress map[progress:[===========================================> ] 112.4 MB/130.2 MB status:Downloading
Which made me realise that it was just downloading some images from my docker account.
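As a more general troubleshooting step (not from the answers above), you can also ask the scheduler directly from a manager node; on systemd-based hosts the daemon log usually lives in the journal rather than /var/log/docker.log:
# "my-stack_web" is a placeholder service name
docker service ps --no-trunc my-stack_web   # full per-task state and error messages
docker service logs my-stack_web            # service logs, if the log driver supports it
journalctl -fu docker                       # daemon log on systemd hosts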

How to remove an image across all nodes in a Docker swarm?

On the local host, I can remove an image using either docker image rm or docker rmi.
What if my current host is a manager node in a Docker swarm and I wish to cascade this operation throughout the swarm?
When I first created the Docker service, the image was pulled down on each node in the swarm. Removing the service did not remove the image and all nodes retain a copy of the image.
It feels natural that if there's a way to "push" an image out to all the nodes then there should be an equally natural way to remove them, without having to SSH into every single machine :'( Plus, this is a real problem: sooner or later the nodes are bound to run out of disk space!
AFAIK there is no such option as of now. Each node is responsible for its own cleanup. There is a command, docker system prune -f, that you can use to clear container data.
But tagged images can be deleted using docker rmi only. See the issue below:
https://github.com/moby/moby/issues/24079
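So, on each node individually, the cleanup described above looks roughly like this (myregistry/myimage:1.0 is a placeholder image name):
# run on every node; there is no swarm-wide equivalent
docker system prune -f               # removes stopped containers, dangling images, unused networks
docker rmi myregistry/myimage:1.0    # removes a specific tagged image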
This is doable. Create host entries in /etc/hosts on your manager node, like this:
1.1.1.1 node01
1.1.1.2 node02
1.1.1.3 node03
Then run
for i in {01..03}; do ssh node$i 'docker rmi $(docker images -q)'; done
Warning: this command will remove all images on every node listed in /etc/hosts.
