How to increase JVM heap size of a docker service?

We are using docker swarm on the server for orchestration, with openjdk8. My backend application is a REST service named "api". On the master, if I do:
docker service ls
I see this result:
ID NAME MODE REPLICAS IMAGE PORTS
7l89205dje61 integration_api replicated 1/1 docker.repo1.tomba.com/koppu/koppu-api:3.1.2.96019dc
.................
From time to time I see this error in the docker service log (docker service logs integration_api):
java.lang.OutOfMemoryError: Java heap space
So I am trying to increase the JVM heap size for this docker service, and I tried:
docker service update --env-add JAVA_OPTS="-Xms3G -Xmx3G -XX:MaxPermSize=1024m" integration_api
I saw this result:
integration_api
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
However, it does not actually seem to increase the heap in the container. What should I do differently?
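One way to check whether the updated variable actually reached the container, and whether the java command line picks it up, is something like this on the node where the task is scheduled (the container ID is a placeholder):
# find the task container for the service on this node
docker ps --filter name=integration_api
# confirm the variable is set inside the container
docker exec <CONTAINER-ID> env | grep JAVA_OPTS
# check the running java command line; JAVA_OPTS only takes effect if the
# image's entrypoint actually passes it to java (a common convention, not a guarantee)
docker top <CONTAINER-ID>
# note: on Java 8, -XX:MaxPermSize is ignored, since PermGen was removed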

Related

How to check heap size inside docker container

We are using docker swarm on the server, with openjdk8. If I do:
docker service ls
I see this result:
ID NAME MODE REPLICAS IMAGE PORTS
7l89205dje61 integration_api replicated 1/1 docker.repo1.tomba.com/koppu/koppu-api:3.1.2.96019dc
.................
I am trying to update the JVM heap size for this service, so I tried:
docker service update --env-add JAVA_OPTS="-Xms3G -Xmx3G -XX:MaxPermSize=1024m" integration_api
I saw this result:
integration_api
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
Now I am trying to see the heap size, but I cannot find a way. When I tried to get inside the container using the ID above:
docker exec -it 7l89205dje61 bash
I get this error:
this container does not exist.
Any suggestion?
Perhaps you can exec into the running container and display the current heap size with something like this?
# get the ID of a running container for your service with
docker ps
# then exec into that container
docker exec -it <CONTAINER-ID> bash
# after execing into the container,
java -XX:+PrintFlagsFinal -version | grep HeapSize
Use this Stack Overflow post to figure out how to exec into a service.
Got the java command to print the heap settings from this Stack Overflow post.
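If the image ships the full JDK rather than just the JRE (an assumption about the image), the live heap of the main process can also be read directly, without grepping flag defaults:
# PID 1 is usually the java process in a single-process container;
# jmap may need to run as the same user as that process
docker exec <CONTAINER-ID> jmap -heap 1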
Note that, to my knowledge, there is no good worked example of these ideas out there yet. One good approach is to implement a "healthcheck" process that queries JVM statistics such as heap usage and reports them to another system.
Another way is to expose the Spring Boot Actuator API so that Prometheus can read and track the metrics over time.
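For the Actuator route, assuming micrometer-registry-prometheus is on the classpath and the prometheus endpoint is exposed (both assumptions about the application, as is the port), the JVM heap gauges can be spot-checked like this:
# in application.properties: management.endpoints.web.exposure.include=health,prometheus
# then query the endpoint; jvm_memory_used_bytes covers the heap areas
curl -s http://localhost:8080/actuator/prometheus | grep jvm_memory_used_bytes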

How to see memory allotment for Docker Engine?

I am setting up a docker instance of an Elasticsearch cluster.
In the instructions it says
Make sure Docker Engine is allotted at least 4GiB of memory
I am ssh'ing to the host, not using docker desktop.
How can I see the resource allotments from the command line?
reference URL
https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-docker.html
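On a plain Linux host (no Docker Desktop), the Docker Engine simply uses the host's memory, so checking what the host has and what the daemon reports is usually enough; for example:
# memory available on the host
free -h
# what the docker daemon sees (look for the "Total Memory" line)
docker info | grep -i "total memory"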
I had the same problem with Docker Desktop on Windows 10 while running Linux containers on WSL2.
I found this issue: https://github.com/elastic/elasticsearch-docker/issues/92 and tried to apply similar logic to the solution there.
I entered the WSL instance's terminal with the wsl -d docker-desktop command.
Then I ran sysctl -w vm.max_map_count=262144 to set the 'allotted memory' (strictly speaking, the maximum number of memory map areas, which is what Elasticsearch checks).
After these steps I could run elasticsearch's docker compose example.
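If the goal is actually to change how much memory the WSL2 VM (and therefore Docker Desktop) is allowed to use, that is configured on the Windows side in a .wslconfig file; a minimal example (the 4GB figure is only illustrative):
# save as %UserProfile%\.wslconfig, then run "wsl --shutdown" so it takes effect
[wsl2]
memory=4GB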
I'd like to go about it by just using one command.
docker stats --all
This will give output such as the following:
$ docker stats --all
CONTAINER ID NAME CPU% MEM USAGE/LIMIT MEM% NET I/O BLOCK I/O PIDS
5f8a1e2c08ac my-compose_my-nginx_1 0.00% 2.25MiB/1.934GiB 0.11% 1.65kB/0B 7.35MB/0B 2
To modify the limits:
In your docker-compose.yml, add the following under the service's deploy: key (for a 4 GiB limit). Note that the deploy: section takes effect when the file is deployed with docker stack deploy (swarm mode):
deploy:
  resources:
    limits:
      memory: 4096M
    reservations:
      memory: 4096M
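Once the limit is in place and the stack is redeployed, it can be verified per container (<CONTAINER-ID> is a placeholder):
# the MEM USAGE / LIMIT column should now show the configured limit
docker stats --no-stream <CONTAINER-ID>
# or read the raw limit in bytes
docker inspect --format '{{.HostConfig.Memory}}' <CONTAINER-ID>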

How to run docker service with volume with multiple replicas

In my docker swarm cluster, I am running a nexus3 repository as a docker image registry. This repository is a critical component of my devops infrastructure, because we have many jenkins instances running in the swarm that start all their builds in separate build agent containers. When my nexus service is down (for example because the cluster node running nexus crashes or is rebooted), my jenkins instances cannot pull the images for the build agent containers and so they are not able to start builds anymore.
Yesterday we had the additional problem that the node nexus was running on was the only node that had a local copy of the nexus image itself. So no other host could launch my nexus service. I had to rebuild my image from the Git repository with the Dockerfile on another node and then launch nexus there. All in all, this took about half an hour in which we were not able to start any build job.
So I tried to start nexus in replicated mode with two replicas, like this:
deploy:
  mode: replicated
  replicas: 2
My nexus service is using a volume like this:
services:
  nexus:
    volumes:
      - sonatype-work:/opt/sonatype/sonatype-work
    [...]
volumes:
  sonatype-work:
    driver: local
    driver_opts:
      o: bind
      device: /mnt/docker-data/services/nexus3/sonatype-work
      type: none
When I redeploy my nexus stack, one instance starts and stays running, while the second one always starts and then exits without an error (state Complete). See docker service ps for my nexus service:
docker#master:/mnt/docker-data/services/nexus3$ docker service ps nexus_nexus
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
oxez5kl866ma nexus_nexus.1 localhost:5001/devops/nexus3:0.0.1 node1 Running Running 2 minutes ago
u0s6slqlj0uf nexus_nexus.2 localhost:5001/devops/nexus3:0.0.1 node4 Ready Ready 1 second ago
d1u1btefzf1s \_ nexus_nexus.2 localhost:5001/devops/nexus3:0.0.1 node4 Shutdown Complete 1 second ago
ythgbtrmycon \_ nexus_nexus.2 localhost:5001/devops/nexus3:0.0.1 node3 Shutdown Complete 8 seconds ago
The log (docker service logs -f nexus_nexus) gives no information why my second instance does not stay running but always completes and restarts.
Is there maybe an I/O conflict, because both instances try to use a volume on the node they are deployed on that points to the same host directory (see device in my volume definition)?
Or does someone have another idea what is going wrong here?
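One way to dig further into why the second replica keeps completing (the service name below is taken from the stack above, the placeholders are not):
# full, untruncated error/exit information per task
docker service ps --no-trunc nexus_nexus
# inspect a failed task's status (use a task ID from the previous command)
docker inspect --format '{{json .Status}}' <TASK-ID>
# on the node where that task ran, check the stopped container directly
docker ps -a --filter name=nexus_nexus
docker logs <CONTAINER-ID>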

Docker container stats empty after exit

I am trying to measure total cpu_usage for a docker container using the GET /containers/$CONTAINER_ID/stats API. While the container is running, the endpoint returns meaningful data:
{"cpu_usage":{"total_usage":213788430,"percpu_usage":[1917539,288587,145139232,1911156,5966276,4931989,11474144,1252778,593573,847283,39465873,0],"usage_in_kernelmode":20000000,"usage_in_usermode":160000000}
but once it exits everything is zeroed out:
{"cpu_usage":{"total_usage":0,"usage_in_kernelmode":0,"usage_in_usermode":0}
How can I get the total CPU usage for a docker container which has finished its work?
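Since the counters come from the container's cgroup, they go away once the container exits, so one workaround is to poll the endpoint while the container is still running and keep the last sample. A minimal sketch over the Docker socket (the container ID is a placeholder):
# one-shot sample of the stats JSON via the unix socket
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/containers/<CONTAINER-ID>/stats?stream=false" \
  | jq '.cpu_stats.cpu_usage.total_usage'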

How do I find out why my Docker service is in Pending state?

I'm playing with Docker Swarm, using a docker-compose.yml with docker deploy. All services get deployed except for one, which stays in Pending state. I have added a constraint that ties this service to one of the nodes. My question is not so much about this particular problem, but more about how to troubleshoot it: the Docker documentation mentions possible causes for a service to be in pending state, but none of them apply here (constraint problems, drained or insufficient resources).
Can I see the docker swarm 'thought process' somewhere? What is it thinking?
Edit: I should have made it clearer that I am using the new swarm mode introduced in Docker 1.12.
Here is how to debug a service that does not start as expected in docker swarm mode.
First of all, get a task ID with docker service ps <service-name>.
Next, check the task's metadata with docker inspect <task-id>. In particular, any error that occurred before the container could start is recorded in the Status field; also reconfirm there whether the task was started with the intended parameters.
If the task has a container ID, the container started and then exited abnormally, so check its log with docker logs <container-id>.
I hope this is of some help.
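Put together, the sequence looks roughly like this (the service name and placeholders are illustrative):
# 1. find the task and its error column
docker service ps --no-trunc <SERVICE-NAME>
# 2. inspect the task's status, including errors raised before any container started
docker inspect --format '{{json .Status}}' <TASK-ID>
# 3. if a container was created, read its logs on the node it ran on
docker logs <CONTAINER-ID>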
docker run swarm (the classic, standalone Swarm) has a --debug option which can tell you more.
See docker swarm issue 2341 or docker issue 24982 to see that option used to debug pending states.
For instance:
(unknown): 192.168.99.106:2375(node2 ip)
└ ID:
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the Docker daemon. Is the docker daemon running on this host?....
