Memory consumption in Kafka on broker scale-out

We have a question about memory consumption when scaling out Kafka brokers. Any suggestion or explanation for the behaviour below would be very helpful.
We run Kafka as a Docker container on Kubernetes.
A memory limit of 4GiB is configured for the Kafka broker Pod. Under a large load, the Kafka broker Pod's memory reached 4GiB, so we manually scaled the Kafka broker Pod replicas out from 1 to 3. After scaling out, however, each Kafka broker Pod still consumes 4GiB of memory for the same load. We expected per-Pod memory consumption of roughly 1.33GiB, since the same load is now spread over 3 Pods.
Before Kafka broker scale-out:
- 1 broker
- 6 topics with 1 partition each
- Memory consumption: 4GiB
After Kafka broker scale-out and rebalancing topics across all brokers:
- 3 brokers
- 6 topics with 1 partition each
- Memory consumption: 10GiB (Pod1: 2GiB, Pod2: 4GiB, Pod3: 4GiB)
After Kafka broker scale-out, rebalancing, and increasing each topic to 3 partitions:
- 3 brokers
- 6 topics with 3 partitions each
- Memory consumption: 12GiB (Pod1: 4GiB, Pod2: 4GiB, Pod3: 4GiB)
All deployments were tested with the same amount of load. The replication factor for all topics is 1.
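For context on why the per-Pod number does not drop: a Kafka broker's JVM heap is fixed at startup via KAFKA_HEAP_OPTS (read by kafka-server-start.sh) and does not shrink just because the load is now shared by more brokers, and the page cache used for log segments also counts against each Pod's memory limit. A minimal sketch, assuming a StatefulSet named kafka and purely illustrative sizes (neither is taken from the question), of pinning each broker to a smaller budget:

# Placeholder workload name; adjust the heap to your sizing.
kubectl set env statefulset/kafka KAFKA_HEAP_OPTS="-Xms1g -Xmx1g"
# Optionally lower the Pod limit to the heap plus headroom for page cache
# and direct buffers (example value only):
kubectl set resources statefulset/kafka --limits=memory=2Gi

With settings like these, per-broker memory is governed by the configured heap and limit, not by dividing the old 4GiB by the number of brokers, which is why scaling out alone did not reduce it to ~1.33GiB.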

Related

kafka broker memory imbalance

We have a three-node Kafka cluster with the following machine configuration:
- node0: 32 cores, 128 GB RAM
- node1: 4 cores, 8 GB RAM
- node2: 32 cores, 128 GB RAM
Only -Xms is configured on all three nodes; -Xmx is not specified.
But after a while:
- node0 occupies 12.7 GB of memory (node0 top screenshot)
- node1 occupies 2.1 GB of memory (node1 top screenshot)
- node2 occupies 3.1 GB of memory (node2 top screenshot)
For node1, the JVM -Xmx defaults to 25% of the machine's memory.
Arthas reports the following memory information for node0 (node0 JVM dashboard screenshot): node0's heap memory footprint is similar to node2's, but the MMAP portion is much larger.
I have two questions:
Why is the memory usage of Kafka's three nodes so different?
Why is node0's RES memory smaller than Arthas's in-heap + out-of-heap sum?
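To check the premise behind the first question (a generic sketch, assuming the brokers use the stock Kafka start scripts; the 6g figure is only an example): when -Xmx is not given, the JVM picks a default max heap of roughly 25% of physical RAM, so node1 (8 GB) ends up with a much smaller heap than node0 and node2 (128 GB each).

# Print the max heap this JVM would pick by default on a given machine:
java -XX:+PrintFlagsFinal -version | grep -i MaxHeapSize

# Make the heap identical on every broker by setting it explicitly before start:
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"
bin/kafka-server-start.sh config/server.properties

As for the larger MMAP portion on node0: Kafka memory-maps its index files, and the resident part of those mappings shows up outside the JVM heap, which may account for some of the difference between the nodes.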

Docker daemon memory consumption grows over time

Here's the scenario:
On a Debian GNU/Linux 9 (stretch) VM I have two containers running. The day before yesterday I got a warning from the monitoring that memory usage was relatively high. Looking at the VM showed that it was not the containers but the Docker daemon that was using the memory (htop screenshot).
After a restart of the service, I noticed memory demand rising again over the following two days (RAM + swap overview graphic).
Is there a known memory leak for this version? (docker version screenshot)
Memory development (containers) after 2 days:
- Container 1 is unchanged
- Container 2 increased from 21.02MiB to 55MiB
Memory development (VM) after 2 days:
- MEM on the machine increased from 273M (after reboot) to 501M
dockerd:
- after restart: 1.3% MEM
- 2 days later: 6.0% MEM
Monitor your containers to see if their memory usage changes over time:
> docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
623104d00e43 hq 0.09% 81.16MiB / 15.55GiB 0.51% 6.05kB / 0B 25.5MB / 90.1kB 3
We saw a similar issue and it seems to have been related to the gcplogs logging driver. We saw the problem on docker 19.03.6 and 19.03.9 (the most up-to-date that we can easily use).
Switching back to using a log forwarding container (e.g. logspout) resolved the issue for us.
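For reference, a rough sketch of that workaround with placeholder values (the daemon.json path is the standard one, the syslog endpoint is made up): stop using gcplogs as the daemon's default driver and forward logs with a separate container instead.

# 1) Make json-file the default logging driver (merge with any existing
#    settings in /etc/docker/daemon.json rather than overwriting them):
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
sudo systemctl restart docker

# 2) Run a log-forwarding container such as logspout (destination is a placeholder):
docker run -d --name=logspout \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  syslog+tcp://logs.example.com:514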

Network usage data between every two pairs of docker containers

I have a few micro-services running in Docker containers (one service per container).
How do I find the network usage between every pair of Docker containers, so that I can build a graph with containers as vertices and the bytes transmitted/received as edge weights?
I used cAdvisor, but it gives me the overall network usage of each container.
Start with docker stats:
$ docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
a645ca0d8feb wp_wordpress_1 0.00% 135.5MiB / 15.55GiB 0.85% 12.2MB / 5.28MB 2.81MB / 9.12MB 11
95e1649c5b79 wp_db_1 0.02% 203MiB / 15.55GiB 1.28% 6.44MB / 3.11MB 0B / 1.08GB 30
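docker stats only reports per-container totals, so it will not give per-pair figures by itself. One possible direction (my own suggestion, not part of the answer above): add accounting-only rules to the DOCKER-USER iptables chain for each pair of container IPs and read their byte counters periodically. This assumes the containers share a bridge network and that net.bridge.bridge-nf-call-iptables=1 (Docker's usual default), so inter-container traffic traverses the FORWARD path; container names and IPs below are placeholders.

# Look up each container's IP:
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' service_a
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' service_b

# Accounting rules (no -j target, so they only count) for the pair, in both directions:
sudo iptables -I DOCKER-USER -s 172.17.0.2 -d 172.17.0.3
sudo iptables -I DOCKER-USER -s 172.17.0.3 -d 172.17.0.2

# The per-rule byte counters become the edge weights of the graph:
sudo iptables -nvx -L DOCKER-USER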

Relationship between dask distributed pods, workers, CPU and RAM in config.yaml

When setting up a Dask cluster using Helm, there is a set of variables in the config.yaml file for customizing the number of workers, and I'm hoping for some help with the terminology. For example, if I set up a Kubernetes cluster with 16 virtual machines, 8 cores/machine and 32GB/virtual machine, I end up with 128 vCPUs and 512GB memory. If I pass "helm ... update -f config.yaml" with the following worker configuration:
worker:
  name: worker
  allowed-failures: 2
  replicas: 48
  resources:
    limits:
      cpu: 2
      memory: 8G
    requests:
      cpu: 2
      memory: 8G
It seems like I should be able to create 64 workers with 2 cpus each, and use all of my 512 GB RAM. (Minus the resources dedicated to the scheduler). However, in practice, the distributed client tops out at 40 workers, 80 cores and 320 GB of total RAM.
Are there best practices around setting up pods to maximize the utilization of the cluster? I know from this post that the workload comes first, in terms of the use of threads and processes per worker, but should the number of workers == the number of cores == number of pods? If so, what is the role of the cpu keyword in the above .yaml file?
My first guess is that other things are running on your nodes, and so Kubernetes doesn't feel comfortable giving everything that you've asked for. For example, Kubernetes itself takes up some memory.
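A quick way to check that guess (a generic kubectl sketch, not tied to the Dask Helm chart; the node name is a placeholder): compare each node's capacity with what Kubernetes actually considers allocatable, and with what is already requested, since a 2-CPU/8G worker only schedules where a full 2 CPUs and 8G remain free on a single node.

kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
CPU_CAP:.status.capacity.cpu,\
CPU_ALLOC:.status.allocatable.cpu,\
MEM_CAP:.status.capacity.memory,\
MEM_ALLOC:.status.allocatable.memory

# Requests already placed on a particular node:
kubectl describe node some-node-name | grep -A8 "Allocated resources"

On 8-core nodes, system reservations plus the chart's scheduler (and Jupyter, if enabled) can leave fewer than 8 allocatable CPUs per node, so fewer 2-CPU workers fit per node than the raw core count suggests.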

RabbitMQ doesn't respect Docker memory limits

I have run my Docker container containing a RabbitMQ instance. I used the docker run command with three parameters (among others):
-m 300m
--kernel-memory="300m"
--memory-swap="400m"
docker stats shows:
> CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
> 1f50929f8e4e 0.40% 126.8 MB / 349.2 MB 36.30% 908.2 kB / 1.406 MB 24.69 MB / 1.114 MB
I expected that RabbitMQ would see only 300MB of RAM, but the high watermark visible in the RabbitMQ UI shows 5.3GB. My host has 8GB available, so RabbitMQ probably read the memory size from the host.
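For what it's worth, RabbitMQ derives its memory high watermark from the total memory it detects, and older releases read the host's RAM rather than the container's cgroup limit, which would explain a watermark based on the 8GB host instead of the 300MB limit. A hedged workaround sketch, with illustrative paths and values: state the threshold absolutely in rabbitmq.conf (RabbitMQ 3.7+ configuration format) and mount it into the container.

# Write a minimal config with an absolute watermark below the cgroup limit:
cat <<'EOF' > rabbitmq.conf
vm_memory_high_watermark.absolute = 200MB
EOF

# Mount it into the official image alongside the original memory flags:
docker run -d --name rabbitmq \
  -m 300m --memory-swap 400m \
  -v "$PWD/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf:ro" \
  rabbitmq:3-management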
