GKE private cluster Django server timeout 502 after 64 API requests - docker

So I have a production-ready GKE private cluster environment where I host my Django Rest Framework microservices.
It all works fine, but after 64 API requests the server times out and the pod becomes unreachable.
I am not sure why this is happening.
I use the following stack:
Django 3.2.13
GKE v1.22.8-gke.201
Postgresql Cloud SQL
Docker
My Django application is a simple one: no authentication on POST, just a small JSON body that is sent and saved to the PostgreSQL database.
The server connects to the database via cloud_sql_proxy, but I also tried using the IP directly with the PySQL library. It works, but I get the same error/timeout.
The affected workloads are any workloads that make a DB call; it does not matter whether it is a SELECT * or an INSERT.
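For context, the database settings boil down to roughly the sketch below; the host/port assume the cloud_sql_proxy sidecar listening on 127.0.0.1:5432, and the database name, user and password are placeholders, not the real values:

# settings.py (sketch) - connection goes through the cloud_sql_proxy sidecar.
# Database name, user and password below are placeholders.
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("DB_NAME", "logserver"),
        "USER": os.environ.get("DB_USER", "postgres"),
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        "HOST": "127.0.0.1",  # cloud_sql_proxy listens locally in the pod
        "PORT": "5432",
        # CONN_MAX_AGE defaults to 0, i.e. Django opens a new DB connection
        # for every request and closes it afterwards.
    }
}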
However, when I run a load test (locust, Python) against the home page of any microservice within the cluster (the readiness endpoint), I do not see any API call timeouts or server restarts. A sketch of the kind of locust test I mean follows the results below.
Type   Name                   # reqs  # fails       |  Avg  Min  Max  Med  |  req/s  failures/s
POST   /api/logserver/logs/       64  0 (0.00%)     |   96   29  140  110  |  10.00        0.00
       Aggregated                 64  0 (0.00%)     |   96   29  140  110  |  10.00        0.00

Type   Name                   # reqs  # fails       |  Avg  Min  Max  Med  |  req/s  failures/s
POST   /api/logserver/logs/       77  13 (19.48%)   |   92   17  140  100  |   0.90        0.00
       Aggregated                 77  13 (19.48%)   |   92   17  140  100  |   0.90        0.00
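For completeness, the locust test is roughly along these lines; this is a sketch rather than the exact file, and the JSON payload fields and wait time are made-up examples:

# locustfile.py (sketch) - endpoint path taken from the results above;
# the payload fields and wait time are made-up examples.
from locust import HttpUser, task, between

class LogServerUser(HttpUser):
    wait_time = between(0.1, 0.5)

    @task
    def post_log(self):
        # Small JSON body, no authentication, as described above.
        self.client.post("/api/logserver/logs/",
                         json={"level": "info", "message": "load test"})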
So it looks like it has something to do with the way the DB is connected to the pods?
I use cloud_sql_proxy to connect to the DB, and this also results in a timeout and a restart of the pod.
I have tried updating gunicorn in the docker environment for Django to:
CMD gunicorn -b :8080 --log-level debug --timeout 90 --workers 6 --access-logfile '-' --error-logfile '-' --graceful-timeout 30 helloadapta.wsgi
And I have tried replacing gunicorn with uwsgi.
I also tried using plain python manage.py runserver 0.0.0.0:8080
They all serve the backend and I can connect to it, but the timeout issue persists.
This is the infrastructure:
A private GKE cluster which uses a subnetwork in GCP.
Cloud NAT on the network for an outbound external static IP (needed so the microservices can be whitelisted by third-party servers).
The Cluster has more than enough memory and cpu:
nodes: 3
total vCPUs: 24
total memory: 96GB
Each node has:
CPU allocatable: 7.91 CPU
Memory allocatable: 29.79 GB
The config in the yaml file states that the pod gets:
resources:
  limits:
    cpu: "1"
    memory: "2Gi"
  requests:
    cpu: "1"
    memory: "2Gi"
Only when I make a readiness call to the server is there no timeout.
So it really points in the direction that Cloud SQL breaks after 64 API calls.
The Cloud SQL Database stack is:
1 sql instance
1 database within the instance
4 vCPUs
15 GB memory
max_connections 85000
The CPU utilisation never goes above 5%.
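Since CPU is clearly not the bottleneck, the thing I would want to watch during the load test is the number of open connections on the Postgres side. A minimal sketch of that check, assuming psycopg2 is available and the cloud_sql_proxy sidecar is listening on 127.0.0.1:5432 (credentials are placeholders):

# check_connections.py (sketch) - watch open connections during the load test.
# Assumes psycopg2 and the cloud_sql_proxy sidecar on 127.0.0.1:5432;
# credentials below are placeholders.
import time
import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5432,
                        dbname="postgres", user="postgres", password="change-me")
conn.autocommit = True

with conn.cursor() as cur:
    while True:
        cur.execute("SELECT count(*) FROM pg_stat_activity;")
        print("open connections:", cur.fetchone()[0])
        time.sleep(5)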

Related

Flink TaskManager Docker Swarm doesn't recover

I'm running Flink v1.10 with 1 JobManager and 3 TaskManagers in Docker Swarm, without ZooKeeper. I have a job running that takes 12 slots, and I have 3 TMs with 20 slots each (60 in total).
After some tests everything went well, except for one test.
The failing test is this: if I cancel the job manually, a side-car retries the job, but the TaskManagers shown in the browser console don't recover and their number keeps decreasing.
A more practical example: I have a job running, consuming 12 slots out of 60.
The web console shows me 48 slots free and 3 TMs.
I cancel the job manually, the side-car retriggers the job, and the web console shows me 36 slots free and 2 TMs.
The job enters a failed state, and the slots keep decreasing until the console shows 0 slots free and 1 TM.
The workaround is to scale all 3 TMs down and back up, and everything goes back to normal.
Everything else works fine with this configuration: the JobManager recovers if I remove it, or if I scale the TMs up or down, but if I cancel the job the TMs seem to lose their connection to the JM.
Any suggestions on what I'm doing wrong?
Here is my flink-conf.yaml.
env.java.home: /usr/local/openjdk-8
env.log.dir: /opt/flink/
env.log.file: /var/log/flink.log
jobmanager.rpc.address: jobmanager1
jobmanager.rpc.port: 6123
jobmanager.heap.size: 2048m
#taskmanager.memory.process.size: 2048m
#env.java.opts.taskmanager: 2048m
taskmanager.memory.flink.size: 2048m
taskmanager.numberOfTaskSlots: 20
parallelism.default: 2
#==============================================================================
# High Availability
#==============================================================================
# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
high-availability: NONE
#high-availability.storageDir: file:///tmp/storageDir/flink_tmp/
#high-availability.zookeeper.quorum: zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
#high-availability.zookeeper.quorum:
# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
# high-availability.zookeeper.client.acl: open
#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================
# state.checkpoints.dir: hdfs://namenode-host:port/flink-checkpoints
# state.savepoints.dir: hdfs://namenode-host:port/flink-checkpoints
# state.backend.incremental: false
jobmanager.execution.failover-strategy: region
#==============================================================================
# Rest & web frontend
#==============================================================================
rest.port: 8080
rest.address: jobmanager1
# rest.bind-port: 8081
rest.bind-address: 0.0.0.0
#web.submit.enable: false
#==============================================================================
# Advanced
#==============================================================================
# io.tmp.dirs: /tmp
# classloader.resolve-order: child-first
# taskmanager.memory.network.fraction: 0.1
# taskmanager.memory.network.min: 64mb
# taskmanager.memory.network.max: 1gb
#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================
# security.kerberos.login.use-ticket-cache: false
# security.kerberos.login.keytab: /mobi.me/flink/conf/smart3.keytab
# security.kerberos.login.principal: smart_user
# security.kerberos.login.contexts: Client,KafkaClient
#==============================================================================
# ZK Security Configuration
#==============================================================================
# zookeeper.sasl.login-context-name: Client
#==============================================================================
# HistoryServer
#==============================================================================
#jobmanager.archive.fs.dir: hdfs:///completed-jobs/
#historyserver.web.address: 0.0.0.0
#historyserver.web.port: 8082
#historyserver.archive.fs.dir: hdfs:///completed-jobs/
#historyserver.archive.fs.refresh-interval: 10000
blob.server.port: 6124
query.server.port: 6125
taskmanager.rpc.port: 6122
high-availability.jobmanager.port: 50010
zookeeper.sasl.disable: true
#recovery.mode: zookeeper
#recovery.zookeeper.quorum: zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
#recovery.zookeeper.path.root: /
#recovery.zookeeper.path.namespace: /cluster_one
The solution was to increase the metaspace size in flink-conf.yaml (the taskmanager.memory.jvm-metaspace.size option).
Br,
André.

Elasticsearch query slow response via kibana console

Server background: a 3-node elasticsearch cluster + kibana + logstash running in a docker environment. The host server runs RHEL 7.7 (2 CPUs, 8GB RAM + 200GB fileshare).
Versions :
elasticsearch 7.5.1
kibana 7.5.1
logstash 7.5.1
filebeat 7.5.1 (runs on separate server)
## Cluster health status
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 116,
  "active_shards" : 232,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
## Node status
172.20.1.3 60 91 13 0.98 1.30 1.45 dilm - elasticsearch2
172.20.1.4 57 91 13 0.98 1.30 1.45 dilm - elasticsearch3
172.20.1.2 61 91 14 0.98 1.30 1.45 dilm * elasticsearch
## Host server TOP output
top - 11:37:10 up 11 days, 22:30, 3 users, load average: 0.74, 1.29, 1.47
Tasks: 210 total, 1 running, 209 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4.4 us, 0.8 sy, 0.0 ni, 94.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 7999840 total, 712736 free, 5842300 used, 1444804 buff/cache
KiB Swap: 3071996 total, 2794496 free, 277500 used. 1669472 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
48491 vtha837 20 0 4003724 713564 23880 S 5.0 8.9 1:57.53 java
59023 vtha837 20 0 6796456 1.5g 172556 S 2.7 20.3 28:18.07 java
59006 vtha837 20 0 6827100 1.5g 176840 S 1.7 19.9 30:03.59 java
760 vtha837 20 0 6877220 1.5g 180752 S 0.7 19.9 24:37.88 java
59610 vtha837 20 0 1663436 258152 7336 S 0.3 3.2 16:51.84 node
## Kibana environment variables I used for kibana docker image
environment:
  SERVER_NAME: "kibana"
  SERVER_PORT: 9548
  ELASTICSEARCH_PASSWORD: ${ES_PASSWORD}
  ELASTICSEARCH_HOSTS: "http://elasticsearch:9550"
  KIBANA_DEFAULTAPPID: "dashboard/Default"
  LOGGING_QUIET: "true"
  XPACK_SECURITY_ENCRYPTIONKEY: ${KIBANA_XPACK_SEC_KEY}
  XPACK_SECURITY_SESSIONTIMEOUT: 600000
Issue:
A. When I run elasticsearch queries via the kibana console, it takes at least 20000 ms to return the output to the console. But if I run the same query directly against elasticsearch via curl, Postman or Chrome, it takes less than 200 ms to get the output.
B. Sometimes the same thing happens when I load the kibana dashboard (not all the time): I get the following error message and some graphs do not load, but I can't see any exceptions or errors in the console logs.
Error in visualization
[esaggs] > Request to Elasticsearch failed: {"error":{}}
If I refresh the page, I can see all the graphs.
Chrome performance profile when hitting the elasticsearch query URL directly: http://testnode.mycompany.com.nz:9550/_cat/indices
Chrome performance profile via the kibana dev console query: GET /_cat/indices
What I don't understand is that if I run the same docker compose file on my laptop (Windows 10, 16GB, i7 2 CPUs, Docker Desktop running), I don't face any slowness, either with the kibana dev console query or when querying elasticsearch directly.
Has anyone had this issue? I'd appreciate it if you could let me know how to fix it.
Thanks in advance.
The issue is Docker service discovery. For some reason service discovery was not working; as soon as I changed the elasticsearch host to the IP, I got the real performance.
Previous kibana configuration in docker-compose:
ELASTICSEARCH_HOSTS: "http://elasticsearch:9550"
New configuration
ELASTICSEARCH_HOSTS: "http://172.20.1.2:9550"
For more details, refer to the elasticsearch discuss page.

Relationship between dask distributed pods, workers, CPU and RAM in config.yaml

When setting up a dask cluster using Helm, there is a set of variables in the config.yaml file for customizing the number of workers, and I'm hoping for some help with the terminology. For example, if I set up a Kubernetes cluster with 16 virtual machines, 8 cores/machine and 32GB/virtual machine, I end up with 128 vCPUs and 512GB memory. Suppose I pass the following via "helm ... update -f config.yaml":
worker:
  name: worker
  allowed-failures: 2
  replicas: 48
  resources:
    limits:
      cpu: 2
      memory: 8G
    requests:
      cpu: 2
      memory: 8G
It seems like I should be able to create 64 workers with 2 cpus each, and use all of my 512 GB RAM. (Minus the resources dedicated to the scheduler). However, in practice, the distributed client tops out at 40 workers, 80 cores and 320 GB of total RAM.
Are there best practices around setting up pods to maximize the utilization of the cluster? I know from this post that the workload comes first, in terms of the use of threads and processes per worker, but should the number of workers == the number of cores == number of pods? If so, what is the role of the cpu keyword in the above .yaml file?
My first guess is that other things are running on your nodes, and so Kubernetes doesn't feel comfortable giving everything that you've asked for. For example, Kubernetes itself takes up some memory.
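To make that concrete, the packing can be sanity-checked per node rather than against cluster-wide totals. The sketch below uses made-up overhead numbers purely for illustration; the real allocatable figures come from kubectl describe nodes:

# Back-of-the-envelope packing check (illustrative numbers, not measured).
# Replace the allocatable figures with the values kubectl reports per node.
nodes = 16
alloc_cpu_per_node = 8 - 0.5   # cores left after an assumed system reservation
alloc_mem_per_node = 32 - 4    # GB left after an assumed system reservation

worker_cpu = 2                 # matches resources.requests.cpu above
worker_mem = 8                 # GB, matches resources.requests.memory above

per_node = min(int(alloc_cpu_per_node // worker_cpu),
               int(alloc_mem_per_node // worker_mem))
print("workers per node:", per_node)
print("workers across the cluster:", per_node * nodes)

Whichever of the two divisions is smaller is the binding constraint, which is usually why the observed worker count comes out lower than the naive "total cores divided by cpu request" figure.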

Why such an overhead for a system running in docker containers?

I have the following question. I was recently designing a Java application on Spring that works with a database, and I decided to perform stress testing. Both the application and the database reside on a virtual Debian machine. I tested it with gatling, and here is what I got:
request count 600 (OK=600 KO=0 )
min response time 12 (OK=12 KO=- )
max response time 159 (OK=159 KO=- )
mean response time 21 (OK=21 KO=- )
std deviation 13 (OK=13 KO=- )
response time 50th percentile 17 (OK=17 KO=- )
response time 75th percentile 22 (OK=22 KO=- )
mean requests/sec 10.01 (OK=10.01 KO=- )
t < 800 ms 600 (100%)
800 ms < t < 5000 ms 0 ( 0%)
t > 5000 ms 0 ( 0%)
failed 0 ( 0%)
So far, so good. After that, I decided to put the database and the jar into two containers. Here is a docker-compose.yml sample for that:
prototype-db:
  build: prototype-db
  volumes:
    - ./prototype-db/data:/var/lib/mysql:rw
    - ./prototype-db/scripts:/docker-entrypoint-initdb.d:ro
  ports:
    - "3306"
prototype:
  image: openjdk:8
  command: bash -c "cd /deploy && java -jar application.jar"
  volumes:
    - ./application/target:/deploy
  depends_on:
    - prototype-db
  ports:
    - "8080:8080"
  dns:
    - 172.16.10.1
    - 172.16.10.2
The Dockerfile looks like this:
FROM mysql:5.7.15
ENV MYSQL_DATABASE=document \
    MYSQL_ROOT_PASSWORD=root \
    MYSQL_USER=testuser \
    MYSQL_PASSWORD=12345
EXPOSE 3306
Now, after testing that with gatling, I got the following results:
---- Global Information --------------------------------------------------------
request count 6000 (OK=3946 KO=2054 )
min response time 0 (OK=124 KO=0 )
max response time 18336 (OK=18336 KO=77 )
mean response time 5021 (OK=7630 KO=10 )
std deviation 4136 (OK=2478 KO=9 )
response time 50th percentile 6516 (OK=8694 KO=9 )
response time 75th percentile 8732 (OK=8905 KO=14 )
mean requests/sec 87.433 (OK=57.502 KO=29.931)
---- Response Time Distribution ------------------------------------------------
t < 800 ms 65 ( 1%)
800 ms < t < 5000 ms 532 ( 9%)
t > 5000 ms 3349 ( 56%)
failed 2054 ( 34%)
---- Errors --------------------------------------------------------------------
java.io.IOException: Remotely closed 1494 (72.74%)
status.find.is(200), but actually found 500 560 (27.26%)
This is astonishing: the mean response time increased drastically and there are a lot of errors, yet this docker compose system runs on the very same virtual Debian machine. What exactly could cause such an overhead? I thought that docker containers are a lot like native processes; they should not be running that slow.

Rethinkdb container: rethinkdb process takes less RAM than the whole container

I'm running my rethinkdb container in a Kubernetes cluster. Below is what I notice:
Running top on the host, which is CoreOS, the rethinkdb process takes about 3GB:
$ top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
981 root 20 0 53.9m 34.5m 20.9m S 15.6 0.4 1153:34 hyperkube
51139 root 20 0 4109.3m 3.179g 22.5m S 15.0 41.8 217:43.56 rethinkdb
579 root 20 0 707.5m 76.1m 19.3m S 2.3 1.0 268:33.55 kubelet
But running docker stats to check the rethinkdb container, it takes about 7GB!
$ docker ps | grep rethinkdb
eb9e6b83d6b8 rethinkdb:2.1.5 "rethinkdb --bind al 3 days ago Up 3 days k8s_rethinkdb-3.746aa_rethinkdb-rc-3-eiyt7_default_560121bb-82af-11e5-9c05-00155d070266_661dfae4
$ docker stats eb9e6b83d6b8
CONTAINER CPU % MEM USAGE/LIMIT MEM % NET I/O
eb9e6b83d6b8 4.96% 6.992 GB/8.169 GB 85.59% 0 B/0 B
$ free -m
total used free shared buffers cached
Mem: 7790 7709 81 0 71 3505
-/+ buffers/cache: 4132 3657
Swap: 0 0 0
Can someone explain why the container is taking a lot more memory than the rethinkdb process itself?
I'm running docker v1.7.1, CoreOS v773.1.0, kernel 4.1.5
In the top command, you are looking at the amount of physical memory. The stats command also includes the disk-cached RAM, so it is always bigger than the physical amount of RAM. When you really need more RAM, the disk cache will be released for the application to use.
Indeed, the memory usage is pulled via the cgroup memory.usage_in_bytes; you can access it at /sys/fs/cgroup/memory/docker/long_container_id/memory.usage_in_bytes. And according to the Linux doc https://www.kernel.org/doc/Documentation/cgroups/memory.txt section 5.5:
5.5 usage_in_bytes
For efficiency, as other kernel components, memory cgroup uses some
optimization to avoid unnecessary cacheline false sharing.
usage_in_bytes is affected by the method and doesn't show 'exact'
value of memory (and swap) usage, it's a fuzz value for efficient
access. (Of course, when necessary, it's synchronized.) If you want to
know more exact memory usage, you should use RSS+CACHE(+SWAP) value in
memory.stat(see 5.2).
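If you want the more exact number the doc refers to, you can read rss and cache out of memory.stat yourself; a minimal sketch (the container id stays a placeholder, as above):

# read_memstat.py (sketch) - sums RSS + CACHE (+ SWAP) from cgroup v1 memory.stat.
# The container id is a placeholder, as above; adjust the path for your host.
path = "/sys/fs/cgroup/memory/docker/long_container_id/memory.stat"

stats = {}
with open(path) as f:
    for line in f:
        key, value = line.split()
        stats[key] = int(value)

total = stats.get("rss", 0) + stats.get("cache", 0) + stats.get("swap", 0)
print("rss + cache (+ swap): %.2f GB" % (total / 1e9))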
