Hey what does this rule say?
expr: |
  sum by (node) (container_memory_rss{id="/system.slice"})
    > ((sum by (node) (kube_node_status_capacity{resource="memory"} - kube_node_status_allocatable{resource="memory"})) * 0.95)
for: 15m
labels:
  severity: warning
We are receiving SNOW tickets saying "System memory usage of 1.161G on exceeds 95% of the reservation". But upon checking there is no memory spike; in fact, memory usage against the node limits has crossed 131%.
I have the following inhibit rule:
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'high'
    equal: ['alertname']
and two alert rules defined accordingly, one with severity high and one with severity critical.
Alert 1
- alert: ContainerCpuUsage
  expr: ContainerCpuUsage > 90
  for: 30m
  labels:
    severity: high
    topic: container
  annotations:
    summary: "Container CPU usage for pod '{{ $labels.pod }}' is above 90% for the last 30 minutes."
    description: "Container CPU usage (name {{ $labels.pod }})\nMeasuredValue={{ printf \"%.2f\" $value }}%"
Alert 2
- alert: ContainerCpuUsage
  expr: ContainerCpuUsage > 98
  for: 30m
  labels:
    severity: critical
    topic: container
  annotations:
    summary: "Container CPU usage for pod '{{ $labels.pod }}' is above 98% for the last 30 minutes."
    description: "Container CPU usage (name {{ $labels.pod }})\nMeasuredValue={{ printf \"%.2f\" $value }}%"
The idea is that when CPU usage jumps suddenly from, let's say, 20% to 99%, a critical alert should fire and the high alert should not. With the inhibit rules above this works perfectly.
But when CPU usage jumps suddenly from 20% to, say, 91%, a high alert fires, which is correct. If after a few minutes the usage climbs further to 99%, a second, critical alert also fires, so in total I have two open alerts: high and critical.
What I want is that once CPU usage is above 98%, the high alert is closed and only the critical one remains open. Why is the high alert not closed/inhibited?
If an alert has already fired, can inhibit rules close it?
The inhibit_rules only mute alerts: they prevent new notifications (emails, messages, etc.) about the target alerts from being sent to recipients, but they do not resolve the alerts themselves.
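If the goal is for the high alert to actually resolve once usage crosses 98%, one possible approach (a sketch, not something the answer above prescribes; it assumes ContainerCpuUsage is the same expression used in the rules above) is to bound the high alert's expr so it only matches the 90-98% range:
- alert: ContainerCpuUsage
  expr: ContainerCpuUsage > 90 and ContainerCpuUsage <= 98
  for: 30m
  labels:
    severity: high
    topic: container
Once usage goes above 98%, the high alert's expression no longer returns a result, so the high alert resolves on its own while the critical alert stays open; the inhibit rule is then only needed to silence high notifications during the transition.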
I have around 1000 targets that are probed using HTTP.
job="http_2xx", env="prod", instance="x.x.x.x"
job="http_2xx", env="test", instance="y.y.y.y"
job="http_2xx", env="dev", instance="z.z.z.z"
For these targets, I want to know:
The rate of failure by env in the last 10 minutes.
The increase in the rate of failure by env in the last 10 minutes.
I am also curious what the following queries do:
sum(increase(probe_success{job="http_2xx"}[10m]))
rate(probe_success{job="http_2xx", env="prod"}[5m]) * 100
The closest I have got is the following, to find the operational percentage by env over 10 minutes:
avg(avg_over_time(probe_success{job="http_2xx", env="prod"}[10m]) * 100)
Rate of failure by env in the last 10 minutes: the easiest way to do it is:
sum(rate(probe_success{job="http_2xx"}[10m]) * 100) by (env)
This will return the percentage of successful probes, which you can turn into a failure percentage by appending * (-1) + 100.
Calculating the rate over 10m and then the increase of that rate over 10m seems redundant; adding an increase function on top of the above query didn't work for me. You can replace the rate function with increase if you want to.
Your first query was pretty close: it calculates the increase of successful probes over a 10m period. To count failed probes instead, filter on == 0 and sum by the "env" label; because the filtered expression is an instant vector, it has to be wrapped in a subquery before a range function can be applied:
sum(count_over_time((probe_success{job="http_2xx"} == 0)[10m:])) by (env)
Your second query will return the percentage of successful probes over 5m for the prod environment.
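For completeness, here is a sketch of a failure-percentage query built on the avg_over_time approach from the question rather than on rate; it assumes the same probe_success metric and env label, and its exact form is my own rather than part of the answer above:
(1 - avg by (env) (avg_over_time(probe_success{job="http_2xx"}[10m]))) * 100
avg_over_time gives the success fraction per target over the window, avg by (env) averages that across targets, and 1 - x flips it into a failure fraction before converting to a percentage.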
I have a Redis instance running in a container.
Inside the container, the cgroup rss shows about 1283MB of memory in use.
The kmem memory usage is 30.75MB.
The sum of the memory usage of all processes in the Docker container is 883MB.
How can I figure out where the "disappeared memory" (1296 - 883 - 30 = 383MB) went? The "disappeared memory" keeps growing over time, and eventually the container gets OOM-killed.
Environment info:
Redis version: 4.0.1
Docker version: 18.09.9
Kubernetes version: 1.13
**the memory usage is 1283MB**
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.usage_in_bytes
1346289664 >>> 1283.921875 MB
**the kmem memory usage is 30.75MB**
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.kmem.usage_in_bytes
32194560 >>> 30.703125 MB
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.stat
cache 3358720
rss 1359073280 >>> 1296.11328125 MB
rss_huge 515899392
shmem 0
mapped_file 405504
dirty 0
writeback 0
swap 0
pgpgin 11355630
pgpgout 11148885
pgfault 25710366
pgmajfault 0
inactive_anon 0
active_anon 1359245312
inactive_file 2351104
active_file 1966080
unevictable 0
hierarchical_memory_limit 4294967296
hierarchical_memsw_limit 4294967296
total_cache 3358720
total_rss 1359073280
total_rss_huge 515899392
total_shmem 0
total_mapped_file 405504
total_dirty 0
total_writeback 0
total_swap 0
total_pgpgin 11355630
total_pgpgout 11148885
total_pgfault 25710366
total_pgmajfault 0
total_inactive_anon 0
total_active_anon 1359245312
total_inactive_file 2351104
total_active_file 1966080
total_unevictable 0
**the sum of the memory usage of all processes in the docker container is 883MB**
root@redis-m-s-rbac-0:/opt# ps aux | awk '{sum+=$6} END {print sum / 1024}'
883.609
This is happening because usage_in_bytes does not show the exact value of memory and swap usage; memory.usage_in_bytes shows the current memory (RSS + cache) usage as a fuzz value.
5.5 usage_in_bytes
For efficiency, as other kernel components, memory cgroup uses some optimization to avoid unnecessary cacheline false sharing. usage_in_bytes is affected by the method and doesn't show 'exact' value of memory (and swap) usage, it's a fuzz value for efficient access. (Of course, when necessary, it's synchronized.) If you want to know more exact memory usage, you should use the RSS+CACHE(+SWAP) value in memory.stat (see 5.2).
Reference:
https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
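As a concrete sketch of what the quoted doc suggests (assuming the same cgroup paths as in the question; the awk one-liner itself is mine, not from the answer), the more exact RSS+CACHE(+SWAP) figure can be computed directly from memory.stat:
root@redis-m-s-rbac-0:/opt# awk '/^(rss|cache|swap) /{sum+=$2} END {print sum / 1024 / 1024 " MB"}' /sys/fs/cgroup/memory/memory.stat
With the memory.stat values shown above this works out to roughly 1299MB (1359073280 + 3358720 + 0 bytes), which is the number to compare against the per-process ps total rather than usage_in_bytes.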
Today the memory allocated for Tarantool memtx ran out (memtx_memory = 5GB); RAM really was busy at 5GB, and after restarting Tarantool more than 4GB was freed.
What could be filling up the RAM? Which settings could this be related to?
box.slab.info()
---
- items_size: 1308568936
  items_used_ratio: 91.21%
  quota_size: 5737418240
  quota_used_ratio: 13.44%
  arena_used_ratio: 89.2%
  items_used: 1193572600
  quota_used: 1442840576
  arena_size: 1442840576
  arena_used: 1287551224
box.info()
---
- version: 2.3.2-26-g38e825b
  id: 1
  ro: false
  uuid: d9cb7d78-1277-4f83-91dd-9372a763aafa
  package: Tarantool
  cluster:
    uuid: b6c32d07-b448-47df-8967-40461a858c6d
  replication:
    1:
      id: 1
      uuid: d9cb7d78-1277-4f83-91dd-9372a763aafa
      lsn: 89759968433
    2:
      id: 2
      uuid: 77557306-8e7e-4bab-adb1-9737186bd3fa
      lsn: 9
    3:
      id: 3
      uuid: 28bae7dd-26a8-47a7-8587-5c1479c62311
      lsn: 0
    4:
      id: 4
      uuid: 6a09c191-c987-43a4-8e69-51da10cc3ff2
      lsn: 0
  signature: 89759968442
  status: running
  vinyl: []
  uptime: 606297
  lsn: 89759968433
  sql: []
  gc: []
  pid: 32274
  memory: []
  vclock: {2: 9, 1: 89759968433}
cat /etc/tarantool/instances.available/my_app.lua
...
memtx_memory = 5 * 1024 * 1024 * 1024,
...
Tarantool version 2.3.2, OS CentOS 7
https://i.stack.imgur.com/onV44.png
It's the result of a process called fragmentation.
The simple reason for this is the following situation:
you have some area allocated for tuples
you put one tuple there and then another one next to it
when the first tuple needs to grow, the database has to relocate it to another place with enough capacity. After that, the place of the first tuple is free, but new space has been taken for the extended tuple.
You can decrease the fragmentation factor by increasing the tuple size for your case.
Choose the size by estimating your typical data, or find the optimal size by watching metrics of your workload over time.
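As a rough way to see where that fragmentation sits (a sketch using the standard box.slab.stats() call; the 30% threshold is an arbitrary value of mine, not something from the answer above), you can list the slab size classes with a large share of free-but-reserved memory in the Tarantool console:
-- list size classes where more than 30% of the reserved memory is currently free
for _, class in ipairs(box.slab.stats()) do
    local total = class.mem_used + class.mem_free
    if total > 0 and class.mem_free / total > 0.3 then
        print(string.format('item_size=%d used=%d free=%d',
                            class.item_size, class.mem_used, class.mem_free))
    end
end
Size classes with a lot of free-but-reserved memory are where the fragmentation described above accumulates, since slabs stay assigned to their size class even after tuples have moved out of it.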
On Mac OS using mono, if I compile and profile the program below, I get the following results:
% fsharpc --nologo -g foo.fs -o foo.exe
% mono --profile=default:stat foo.exe
...
Statistical samples summary
Sample type: cycles
Unmanaged hits: 336 (49.1%)
Managed hits: 349 (50.9%)
Unresolved hits: 1 ( 0.1%)
Hits % Method name
154 22.48 Microsoft.FSharp.Collections.SetTreeModule:height ...
105 15.33 semaphore_wait_trap
74 10.80 Microsoft.FSharp.Collections.SetTreeModule:add ...
...
Note the second entry, semaphore_wait_trap.
Here is the program:
[<EntryPoint>]
let main args =
    let s = seq { 1..1000000 } |> Set.ofSeq
    s |> Seq.iter (fun _ -> ())
    0
I looked in the source for the Set module, but I didn't find any (obvious) locking.
Is my single-threaded program really spending 15% of its execution time messing with semaphores? If it is, can I make it not do that and get a performance boost?
According to Instruments, it's sgen/gc calling semaphore_wait_trap:
Sgen is documented as stopping all other threads while it collects:
Before doing a collection (minor or major), the collector must stop all running threads so that it can have a stable view of the current state of the heap, without the other threads changing it.
In other words, when the code is trying to allocate memory and a GC is required, the time it takes shows up under semaphore_wait_trap since that's your application thread. I suspect the mono profiler doesn't profile the gc thread itself so you don't see the time in the collection code.
The germane output then is really the GC summary:
GC summary
GC resizes: 0
Max heap size: 0
Object moves: 1002691
Gen0 collections: 123, max time: 14187us, total time: 354803us, average: 2884us
Gen1 collections: 3, max time: 41336us, total time: 60281us, average: 20093us
If you want your code to run faster, don't collect as often.
Understanding the actual cost of collection can be done through dtrace since sgen has dtrace probes.
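One knob for collecting less often (a sketch, not something the answer above recommends; nursery-size is a standard MONO_GC_PARAMS option for sgen, but the 64m value here is arbitrary) is to give sgen a larger nursery so this allocation-heavy Set build triggers fewer Gen0 collections:
% MONO_GC_PARAMS=nursery-size=64m mono --profile=default:stat foo.exe
Whether it helps shows up directly as a lower Gen0 collections count in the GC summary; the bigger win, of course, is allocating less in the first place.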