We have a Docker swarm and normally run service containers using the Docker create-service API. After a certain time interval, the services stop responding (that is, the application running inside the container stops responding). For now, the workaround is to restart the service at a fixed interval, and that worked when we tried it manually.
This is the top command output of Host worker node
And output of docker stats
What is the best approach to fix this, and can the solution be automated?
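If periodic restarts remain the workaround, they could be scripted rather than done by hand. The following is only a sketch: it assumes that docker service update --force, which redeploys a service's tasks even when nothing has changed, is an acceptable way to restart, and the function names and the six-hour interval are placeholders. It does not address why the application stops responding; a health check in the service definition would let swarm replace unhealthy tasks on its own.

```python
import subprocess
import time


def force_update_command(service):
    """Build the CLI call that redeploys every task of one service."""
    return ["docker", "service", "update", "--force", service]


def restart_services(services, run=subprocess.run):
    """Force-update each service in turn; return the names that failed."""
    failed = []
    for service in services:
        result = run(force_update_command(service))
        if result.returncode != 0:
            failed.append(service)
    return failed


def restart_forever(services, interval_seconds=6 * 60 * 60):
    """Restart the given services at a fixed interval (placeholder: 6 hours)."""
    while True:
        time.sleep(interval_seconds)
        restart_services(services)
```

The script has to run on a manager node (or against a manager's API), since service updates are manager operations.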
I'm trying to dockerize two .NET console applications, one of which depends on the other.
When I run the 1st container, I need it to run another container on the host, pass a parameter to its stdin, and wait for it to finish the job and exit.
What will be the best solution to achieve this?
Running a container inside a container seems like a bad solution to me.
I've also thought about running a separate process with a web server (nginx or similar) on the host that receives an HTTP request and executes a docker run command on the host, but I'm sure there is a better solution (with this approach the web server would run on the host rather than inside a container).
There is also this solution but it seems to have major security issues.
I've also tried using the Docker.DotNet library, but it does not help with my problem.
Any ideas for the best solution?
Thanks in advance.
EDIT:
I will be using docker compose, but the problem is that the 2nd container is not running and listening at all times. Like the hello-world container, it is called, performs its job, and exits.
EDIT2: FIX
I've implemented Redis as a message broker for communication between the services. While this changed the requirement a little (the containers now always run and listen to Redis), it solved the issue.
I am not sure I understand correctly what you need to do, but if you simply need to start two containers in parallel the simplest way I can think of is docker-compose.
Another way is a Python or bash/bat script that launches the containers independently (in Python you can either use the Docker API or do it manually with the subprocess module). This lets you do other things as well, such as writing to the stdin of one container, as you described.
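A minimal sketch of the subprocess variant, assuming the script runs somewhere the docker CLI can reach the daemon (for example on the host). The function names, image name, and parameter are placeholders, not anything from the question.

```python
import subprocess


def build_run_command(image, args=()):
    """Build a `docker run` call that keeps stdin open (-i) and
    removes the container once it exits (--rm)."""
    return ["docker", "run", "--rm", "-i", image, *args]


def run_with_stdin(image, stdin_text, args=(), timeout_seconds=300):
    """Start the container, write stdin_text to its stdin, wait for it
    to exit, and return (exit_code, stdout)."""
    result = subprocess.run(
        build_run_command(image, args),
        input=stdin_text,       # sent to the container's stdin, then stdin is closed
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    return result.returncode, result.stdout


# Hypothetical usage:
#   code, out = run_with_stdin("my-worker-image", "some-parameter\n")
```

Because subprocess.run blocks until the process exits, the caller naturally waits for the second container to finish its job.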
I have a bunch of docker containers running in swarm mode (services). If the whole server restarts then containers start to run one by one after server reboot. Is there a way to set an order of container creation and run?
P.S. I can't use docker-compose, as these services were created dynamically through the Docker Remote API.
You can try setting a shorter restart delay (with --restart-delay) on the services you want to start first, a longer one on the next, and so on.
But I am not sure that it works.
After running docker stack deploy to deploy some services to swarm is there a way to programmatically test if all containers started correctly?
The purpose would be to verify in a staging CI/CD pipeline that the containers are actually running and didn't fail on startup. Restart is disabled via restart_policy.
I was looking at docker stack services, is the replicas column useful for this purpose?
$ docker stack services --format "{{.ID}} {{.Replicas}}" my-stack-name
lxoksqmag0qb 0/1
ovqqnya8ato4 0/1
Yes, there are ways to do it, but it's manual and you'd have to be pretty comfortable with docker cli. Docker does not provide an easy built-in way to verify that docker stack deploy succeeded. There is an open issue about it.
Fortunately for us, the community has created a few tools that address Docker's shortcomings in this regard. Some of the most notable ones:
https://github.com/issuu/sure-deploy
https://github.com/sudo-bmitch/docker-stack-wait
https://github.com/ubirak/docker-php
Issuu, authors of sure-deploy, have a very good article describing this issue.
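For a lightweight check without extra tools, the Replicas column asked about in the question can be parsed directly. This is a sketch, assuming the output format shown in the question ("ID running/desired" per line); the function names are hypothetical.

```python
def parse_replicas(output):
    """Parse lines like 'lxoksqmag0qb 0/1' into {id: (running, desired)}."""
    services = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        running, desired = parts[1].split("/")
        services[parts[0]] = (int(running), int(desired))
    return services


def not_converged(output):
    """Return the IDs of services whose running count differs from the desired count."""
    return sorted(
        service_id
        for service_id, (running, desired) in parse_replicas(output).items()
        if running != desired
    )
```

A CI step would feed it the output of the docker stack services command from the question and poll until the list is empty or a deadline passes. A single snapshot taken right after the deploy can show 0/1 simply because the container is still starting.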
Typically in CI/CD I see everyone using docker or docker-compose. A container runs the same in docker as it does in docker swarm with respect to "does this container work by itself as intended".
That being said, if you still wanted to do integration testing in a multi-tier solution with swarm, you could do various things in automation. Note this would all be done on a single node swarm to make testing easier (docker events doesn't pull node events from all nodes, so tracking a single node is much easier for ci/cd):
Have something monitoring docker events, e.g. docker events -f service=<service-name> to ensure containers aren't dying.
Always have healthchecks in your containers. They are the #1 way to ensure your app is healthy (at the container level) and you'll see them succeed or fail in docker events. You can put them in Dockerfiles, service create commands, and stack/compose files. Here are some great examples.
You could attach another container to the same network to test your services remotely 1-by-1 using tasks.<service-name> with reverse DNS. This will avoid the VIP and let you talk to a specific replica(s).
You might get some stuff out of docker inspect <service-id or task-id>
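Alongside docker inspect, task states can be read from docker service ps. The sketch below assumes the output of docker service ps --format "{{.Name}} {{.CurrentState}}" <service-name>, where each line looks like "web.1 Running 3 minutes ago"; the function name and the choice of which states count as failures are assumptions.

```python
# Task states treated as failures in this sketch (an assumption; adjust as needed).
FAILURE_STATES = {"failed", "rejected"}


def failed_tasks(ps_output):
    """Return the names of tasks whose current state is a failure state.

    Expects one task per line: '<task name> <State> <age...>'.
    """
    failed = []
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, state = parts[0], parts[1].lower()
        if state in FAILURE_STATES:
            failed.append(name)
    return failed
```

With restarts disabled, as in the question, a task that crashed on startup stays in the Failed state, so a non-empty result is a reasonable signal to fail the pipeline.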
Another solution might be to use docker service scale: it does not return until the service has converged to the specified number of replicas, or until it times out.
export STACK=devstack # swarm stack name
export SERVICE_APP=yourservice # service name
export SCALE_APP=2 # desired amount of replicas
docker stack deploy $STACK --with-registry-auth
docker service scale ${STACK}_${SERVICE_APP}=${SCALE_APP}
One drawback of this method is that you need to provide the service names and their replica counts (but these can be extracted from the compose spec file using jq).
Also, in my use case I had to specify a timeout by prepending the timeout command, i.e. timeout 60 docker service scale, because docker service scale waited for its own timeout even if some containers had failed, which could slow down continuous delivery pipelines.
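The two points above can be sketched in Python instead of jq and timeout. This is an illustration under assumptions: the compose file has already been loaded into a dict (for example from a JSON rendering of it), services without deploy.replicas default to one replica, and the function names are hypothetical. The timeout argument of subprocess.run stands in for the timeout command.

```python
import subprocess


def scale_targets(stack, compose):
    """Build 'stack_service=replicas' arguments from a parsed compose spec."""
    targets = []
    for name, spec in sorted(compose.get("services", {}).items()):
        deploy = (spec or {}).get("deploy") or {}
        if deploy.get("mode", "replicated") != "replicated":
            continue  # global services cannot be scaled
        replicas = deploy.get("replicas", 1)
        targets.append(f"{stack}_{name}={replicas}")
    return targets


def scale_with_timeout(targets, timeout_seconds=60):
    """Run docker service scale; return False on failure or timeout."""
    try:
        result = subprocess.run(
            ["docker", "service", "scale", *targets],
            timeout=timeout_seconds,  # kills the CLI call if it hangs
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```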
References
Docker CLI: docker service scale
jq - command-line JSON processor
GNU Coreutils: timeout command
You can call this for every service. It returns once the service has converged (everything is OK).
docker service update STACK_SERVICENAME
I am running a docker container which contains a node server. I want to attach to the container, kill the running server, and restart it (for development). However, when I kill the node server it kills the entire container (presumably because I am killing the process the container was started with).
Is this possible? This answer helped, but it doesn't explain how to kill the container's default process without killing the container (if possible).
If what I am trying to do isn't possible, what is the best way around this problem? Adding command: bash -c "while true; do echo 'Hit CTRL+C'; sleep 1; done" to each image in my docker-compose, as suggested in the comments of the linked answer, doesn't seem like the ideal solution, since it forces me to attach to my containers after they are up and run the command manually.
This is by design in Docker. Each container is supposed to be a stateless instance of a service. If that service is interrupted, the container is destroyed. If that service is requested/started, it is created. At least, that is the case if you're using an orchestration platform like k8s, swarm, mesos, cattle, etc.
There are applications that exist to represent PID 1 rather than the service itself. But this goes against the design philosophy of microservices and containers. Here is an example of an init system that can run as PID 1 instead and allow you to kill and spawn processes within your container at will: https://github.com/Yelp/dumb-init
Why do you want to reboot the node server? To apply changes from a config file or something? If so, you're looking for a solution in the wrong direction. You should instead define a persistent volume so that when the container respawns the service would reread said config file.
https://docs.docker.com/engine/admin/volumes/volumes/
If you need to restart the process that the container is running, then simply run:
docker restart $container_name_or_id
Exec'ing into a container shouldn't be needed for normal operations; consider it a debugging tool.
Rather than changing the script that gets run to automatically restart, I'd move that out to the docker engine so it's visible if your container is crashing:
docker run --restart=unless-stopped ...
When a container is run with the above option, docker will restart it for you, unless you intentionally run a docker stop on the container.
As for why killing PID 1 in the container shuts it down, it's the same as killing PID 1 on a Linux server. If you kill init/systemd, the box will go down. Inside the namespace of the container, similar rules apply and cannot be changed.
I have a docker container that I want to deploy to a CoreOS cluster that has to download my app from a git repo.
Let's say the app container runs nginx / nodejs
How should I update it?
If I submit the container and start it, that works the first time. But the second time I'll have to stop/start the container with fleetctl, and then I'll obviously have downtime. Should I start up new containers that are derived from that container?
Here's a complete walkthrough on exactly such a scenario:
http://coreos.com/blog/zero-downtime-frontend-deploys-vulcand.html
Instead of pulling down your application from github inside your container, you should bake your application code inside your container/image. Your container should start its services within a few seconds. To achieve zero downtime you should keep the old container running until your new container has started and is ready to accept new connections. You could do this by separating nginx into its own container and keep it running all the time.
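Keeping the old container until the new one is ready requires some readiness signal. As a minimal sketch, assuming a successful TCP connection is a good enough signal (the function name, host, port, and timeouts are placeholders), a deploy script could poll the new container before stopping the old one:

```python
import socket
import time


def wait_for_port(host, port, timeout_seconds=30.0, interval_seconds=0.5):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as a connection is accepted, or False if the
    deadline passes first.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval_seconds):
                return True
        except OSError:
            time.sleep(interval_seconds)  # not accepting connections yet; retry
    return False
```

Only once this returns True for the new container would the script stop the old one. An application-level check, such as an HTTP health endpoint, would be a stricter test than an open port.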