Docker daemon out of memory but memory available on host

I have been looking here and elsewhere on the Internet and didn't find an answer to my problem. Here is the situation.
My system is not running out of memory:
System information as of Sat May 23 12:06:56 CEST 2020
System load:     0.02
Usage of /:      72.8% of 38.71GB
Memory usage:    12%
Swap usage:      0%
Processes:       126
Users logged in: 0
IP address for ens3:            92.222.89.93
IP address for docker_gwbridge: 172.18.0.1
IP address for docker0:         172.17.0.1
My docker daemon does not show any problem:
top - 12:14:46 up 2 days, 22:11, 1 user, load average: 37.87, 21.54, 26.91
Tasks: 187 total, 2 running, 140 sleeping, 0 stopped, 1 zombie
%Cpu(s): 0.8 us, 92.4 sy, 0.0 ni, 0.0 id, 6.5 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 3941032 total, 118616 free, 3720504 used, 101912 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 14676 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
34 root 20 0 0 0 0 R 71.6 0.0 28:08.90 kswapd0
11405 root 20 0 117504 3920 0 S 0.9 0.1 0:00.32 caddy
1005 root 20 0 934736 61288 0 S 0.7 1.6 19:09.86 dockerd
But I get an out of memory error when I invoke docker commands:
# docker ps
fatal error: runtime: out of memory
runtime stack:
runtime.throw(0x55d00d234abc, 0x16)
/usr/local/go/src/runtime/panic.go:617 +0x74 fp=0x7ffc081913c0 sp=0x7ffc08191390 pc=0x55d00bc58574
runtime.sysMap(0xc000000000, 0x4000000, 0x55d00f1ddf98)
/usr/local/go/src/runtime/mem_linux.go:170 +0xc9 fp=0x7ffc08191400 sp=0x7ffc081913c0 pc=0x55d00bc43889
runtime.(*mheap).sysAlloc(0x55d00f1c4a80, 0x2000, 0x55d00f1c4a90, 0x1)
/usr/local/go/src/runtime/malloc.go:633 +0x1cf fp=0x7ffc081914a8 sp=0x7ffc08191400 pc=0x55d00bc3669f
runtime.(*mheap).grow(0x55d00f1c4a80, 0x1, 0x0)
/usr/local/go/src/runtime/mheap.go:1222 +0x44 fp=0x7ffc08191500 sp=0x7ffc081914a8 pc=0x55d00bc50c94
runtime.(*mheap).allocSpanLocked(0x55d00f1c4a80, 0x1, 0x55d00f1ddfa8, 0x0)
/usr/local/go/src/runtime/mheap.go:1150 +0x381 fp=0x7ffc08191538 sp=0x7ffc08191500 pc=0x55d00bc50b81
runtime.(*mheap).alloc_m(0x55d00f1c4a80, 0x1, 0x2a, 0x6e43a318)
/usr/local/go/src/runtime/mheap.go:977 +0xc6 fp=0x7ffc08191588 sp=0x7ffc08191538 pc=0x55d00bc501d6
runtime.(*mheap).alloc.func1()
/usr/local/go/src/runtime/mheap.go:1048 +0x4e fp=0x7ffc081915c0 sp=0x7ffc08191588 pc=0x55d00bc812fe
runtime.(*mheap).alloc(0x55d00f1c4a80, 0x1, 0x55d00b01002a, 0x7ffc08191660)
/usr/local/go/src/runtime/mheap.go:1047 +0x8c fp=0x7ffc08191610 sp=0x7ffc081915c0 pc=0x55d00bc504ac
runtime.(*mcentral).grow(0x55d00f1c5880, 0x0)
/usr/local/go/src/runtime/mcentral.go:256 +0x97 fp=0x7ffc08191658 sp=0x7ffc08191610 pc=0x55d00bc43307
runtime.(*mcentral).cacheSpan(0x55d00f1c5880, 0x7ff733676000)
/usr/local/go/src/runtime/mcentral.go:106 +0x301 fp=0x7ffc081916b8 sp=0x7ffc08191658 pc=0x55d00bc42e11
runtime.(*mcache).refill(0x7ff733676008, 0x2a)
/usr/local/go/src/runtime/mcache.go:135 +0x88 fp=0x7ffc081916d8 sp=0x7ffc081916b8 pc=0x55d00bc428a8
runtime.(*mcache).nextFree(0x7ff733676008, 0x55d00f1ba92a, 0x7ff733676008, 0x7ff733676000, 0x8)
/usr/local/go/src/runtime/malloc.go:786 +0x8a fp=0x7ffc08191710 sp=0x7ffc081916d8 pc=0x55d00bc36eda
runtime.mallocgc(0x180, 0x55d00df28480, 0x1, 0x55d00f1de000)
/usr/local/go/src/runtime/malloc.go:939 +0x780 fp=0x7ffc081917b0 sp=0x7ffc08191710 pc=0x55d00bc37810
runtime.newobject(0x55d00df28480, 0x4000)
/usr/local/go/src/runtime/malloc.go:1068 +0x3a fp=0x7ffc081917e0 sp=0x7ffc081917b0 pc=0x55d00bc37c1a
runtime.malg(0x22b1800008000, 0x55d00f1c70f0)
/usr/local/go/src/runtime/proc.go:3220 +0x33 fp=0x7ffc08191820 sp=0x7ffc081917e0 pc=0x55d00bc61a23
runtime.mpreinit(...)
/usr/local/go/src/runtime/os_linux.go:311
runtime.mcommoninit(0x55d00f1bed40)
/usr/local/go/src/runtime/proc.go:618 +0xc6 fp=0x7ffc08191858 sp=0x7ffc08191820 pc=0x55d00bc5b396
runtime.schedinit()
/usr/local/go/src/runtime/proc.go:540 +0x78 fp=0x7ffc081918b0 sp=0x7ffc08191858 pc=0x55d00bc5b028
runtime.rt0_go(0x7ffc081919b8, 0x2, 0x7ffc081919b8, 0x0, 0x7ff732ca9b97, 0x2, 0x7ffc081919b8, 0x200008000, 0x55d00bc83370, 0x0, ...)
/usr/local/go/src/runtime/asm_amd64.s:195 +0x11e fp=0x7ffc081918b8 sp=0x7ffc081918b0 pc=0x55d00bc8349e
Any suggestion is welcome, although I am not sure I will be able to reproduce the problem.

After I set a memory limit on one of my containers, as advised by @John Manko, the problem didn't arise anymore. It seems that this fixed it.
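For reference, here is a minimal sketch of how such a memory limit can be applied; the container name my-container, the image name my-image and the 512m value are illustrative assumptions, not details from the question:
# Limit an existing, running container to 512 MB of RAM
docker update --memory 512m --memory-swap 512m my-container
# Or start a new container with the limit already in place
docker run -d --name my-container --memory 512m my-image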

Related

Already using Docker limits, why is the server CPU load average still high?

The server has 8 CPUs and 64 GB of memory. The application is deployed with Docker Swarm, and CPU and memory are already limited:
docker service update java --reserve-cpu 2 --limit-cpu 2 --limit-memory 4G --reserve-memory 4G
Checking the server with top:
top - 11:09:26 up 889 days, 19:31, 16 users, load average: 72.50, 78.28, 55.54
Tasks: 271 total, 2 running, 183 sleeping, 0 stopped, 0 zombie
%Cpu(s): 36.7 us, 7.2 sy, 0.0 ni, 52.4 id, 0.0 wa, 0.0 hi, 3.7 si, 0.0 st
KiB Mem : 65916324 total, 27546636 free, 8884904 used, 29484784 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 56442876 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15566 root 20 0 12.563g 2.871g 17580 S 199.0 4.6 42:06.92 java
45076 root 20 0 19.018g 337692 16680 S 167.2 0.5 0:07.38 java
14692 root 20 0 1941688 122152 50868 S 4.0 0.2 617:42.71 dockerd
There is no other application running on the server, so why is the load average so high? Could someone help me check this? Thanks.
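As a hedged sketch for sanity-checking such a setup (the service name java comes from the question; everything else is an assumption), you can confirm which limits Swarm actually applied and see which containers are consuming the CPU:
# Show the CPU/memory limits and reservations stored in the service spec
docker service inspect java --format '{{json .Spec.TaskTemplate.Resources}}'
# One snapshot of per-container CPU and memory usage on this node
docker stats --no-stream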

Understanding Docker container CPU usage

docker stats shows very high CPU usage, but the top output shows that 88.3% of the CPU is idle. Inside the container is a Java HTTP/Thrift service.
docker stats :
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8a0488xxxx5 540.9% 41.99 GiB / 44 GiB 95.43% 0 B / 0 B 0 B / 35.2 MB 286
top output :
top - 07:56:58 up 2 days, 22:29, 0 users, load average: 2.88, 3.01, 3.05
Tasks: 13 total, 1 running, 12 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.2 us, 2.7 sy, 0.0 ni, 88.3 id, 0.0 wa, 0.0 hi, 0.9 si, 0.0 st
KiB Mem: 65959920 total, 47983628 used, 17976292 free, 357632 buffers
KiB Swap: 7999484 total, 0 used, 7999484 free. 2788868 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8823 root 20 0 58.950g 0.041t 21080 S 540.9 66.5 16716:32 java
How can I reduce the reported CPU usage and bring it under 100%?
According to the top man page:
When operating in Solaris mode (`I' toggled Off), a task's cpu usage will be divided by the total number of CPUs. After issuing this command, you'll be told the new state of this toggle.
So by pressing the I key while top is running in interactive mode, you switch to Solaris mode, and each task's CPU usage is divided by the total number of CPUs (or cores).
P.S.: This option is not available on all versions of top.
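The same reasoning applies to docker stats, which reports CPU as a percentage of a single core, so a multi-threaded process can legitimately exceed 100%. Below is a minimal sketch, assuming GNU nproc and bc are available, that normalizes the figure by the number of host cores:
#!/bin/bash
# Divide each container's reported CPU percentage by the number of host cores
cores=$(nproc)
docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' |
while read -r name cpu; do
    cpu=${cpu%\%}    # strip the trailing % sign
    echo "$name: $(echo "scale=1; $cpu / $cores" | bc)% of total host CPU"
done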

Finding the memory consumption of each redis DB

The problem
One of my Python Redis clients fails with the following exception:
redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
I have checked the redis machine, and it seems to be out of memory:
free
total used free shared buffers cached
Mem: 3952 3656 295 0 1 9
-/+ buffers/cache: 3645 306
Swap: 0 0 0
top
top - 15:35:03 up 14:09, 1 user, load average: 0.06, 0.17, 0.16
Tasks: 114 total, 2 running, 112 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.2 st
KiB Mem: 4046852 total, 3746772 used, 300080 free, 1668 buffers
KiB Swap: 0 total, 0 used, 0 free. 11364 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1102 root 20 0 3678836 3.485g 736 S 1.3 90.3 10:12.53 redis-server
1332 ubuntu 20 0 41196 3096 972 S 0.0 0.1 0:00.12 zsh
676 root 20 0 10216 2292 0 S 0.0 0.1 0:00.03 dhclient
850 syslog 20 0 255836 2288 124 S 0.0 0.1 0:00.39 rsyslogd
I am using a few dozen Redis DBs in a single Redis instance. Each DB is denoted by a numeric id given to redis-cli, e.g.:
$ redis-cli -n 80
127.0.0.1:6379[80]>
How do I know how much memory each DB consumes, and what the largest keys in each DB are?
You CANNOT get the used memory for each DB. With the INFO command, you can only get the total memory used by the Redis instance. Redis records the newly allocated memory size each time it dynamically allocates some memory, but it does not keep such a record per DB. It also does not keep any record of the largest keys.
Normally, you should configure your Redis instance with maxmemory and maxmemory-policy (i.e. the eviction policy applied when maxmemory is reached).
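A minimal sketch of that configuration (the 2gb value and the allkeys-lru policy are illustrative assumptions, not recommendations from the answer):
# In redis.conf:
#   maxmemory 2gb
#   maxmemory-policy allkeys-lru
# Or set at runtime:
redis-cli CONFIG SET maxmemory 2gb
redis-cli CONFIG SET maxmemory-policy allkeys-lru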
You can write a small shell script like this (it shows the element count in each DB):
#!/bin/bash
# Print the number of keys in each Redis DB, from 0 to max_db-1
max_db=501
i=0
while [ $i -lt $max_db ]
do
echo "db_number: $i"
redis-cli -n $i dbsize
i=$((i+1))
done
Example output:
db_number: 0
(integer) 71
db_number: 1
(integer) 0
db_number: 2
(integer) 1
db_number: 3
(integer) 1
db_number: 4
(integer) 0
db_number: 5
(integer) 1
db_number: 6
(integer) 28
db_number: 7
(integer) 1
I know that a single database may just contain one very large key, but in some cases this script can still help.

Any hook for Docker killed by out of memory

I'm running a Docker container for a daemon job, and the container gets killed every few hours. I'd like to add a hook (callback), such as:
restart the container and then run some commands on the restarted container
Is it possible to do that with Docker?
Otherwise, is there a better approach to detecting this behaviour with Python or Ruby?
java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
java cpuset=bcb33ac552c23cfa531814fbc3a64ae5cd8d85aa19245e1560e0ce3e3310c798 mems_allowed=0
CPU: 3 PID: 14182 Comm: java Not tainted 4.1.0-x86_64-linode59 #1
0000000000000000 ffff8800dc520800 ffffffff8195b396 ffff880002cf5ac0
ffffffff81955e58 ffff8800a2918c38 ffff8800f43c3e78 0000000000000000
ffff8800b5f687f0 000000000000000d ffffea0002d7da30 ffff88005bebdec0
Call Trace:
[<ffffffff8195b396>] ? dump_stack+0x40/0x50
[<ffffffff81955e58>] ? dump_header+0x7b/0x1fe
[<ffffffff8119655d>] ? __do_fault+0x3f/0x79
[<ffffffff811789d6>] ? find_lock_task_mm+0x2c/0x7b
[<ffffffff81961c55>] ? _raw_spin_unlock_irqrestore+0x2d/0x3e
[<ffffffff81178dee>] ? oom_kill_process+0xc5/0x387
[<ffffffff811789d6>] ? find_lock_task_mm+0x2c/0x7b
[<ffffffff811b76be>] ? mem_cgroup_oom_synchronize+0x3ad/0x4c7
[<ffffffff811b6c92>] ? mem_cgroup_is_descendant+0x29/0x29
[<ffffffff811796e7>] ? pagefault_out_of_memory+0x1c/0xc1
[<ffffffff81963e58>] ? page_fault+0x28/0x30
Task in /docker/bcb33ac552c23cfa531814fbc3a64ae5cd8d85aa19245e1560e0ce3e3310c798 killed as a result of limit of /docker/bcb33ac552c23cfa531814fbc3a64ae5cd8d85aa19245e1560e0ce3e3310c798
memory: usage 524288kB, limit 524288kB, failcnt 14716553
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /docker/bcb33ac552c23cfa531814fbc3a64ae5cd8d85aa19245e1560e0ce3e3310c798: cache:72KB rss:524216KB rss_huge:0KB mapped_file:64KB writeback:0KB inactive_anon:262236KB active_anon:262044KB inactive_file:4KB active_file:4KB unevictable:0KB
[ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[14097] 1000 14097 5215 20 17 3 47 0 entry_point.sh
[14146] 0 14146 11960 0 30 3 101 0 sudo
[14150] 1000 14150 1112 7 8 3 22 0 xvfb-run
[14162] 1000 14162 51929 11220 90 3 95 0 Xvfb
[14169] 1000 14169 658641 18749 120 6 0 0 java
[14184] 1000 14184 28364 555 58 3 0 0 fluxbox
[24639] 1000 24639 5212 59 16 3 0 0 bash
Memory cgroup out of memory: Kill process 14169 (java) score 96 or sacrifice child
Killed process 14169 (java) total-vm:2634564kB, anon-rss:74996kB, file-rss:0kB
Docker itself doesn't have any such mechanism. All you can do is pass the --restart flag to tell Docker when it should try to bring a failed container back.
However, in most places where you want to keep a container up, you'll want something more complex than the --restart flag anyway. Once you're using runit or systemd to manage your containers, it's easy to add a little extra shell code that figures out why the last invocation crashed and takes special action based on that.
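A minimal sketch of that idea, assuming a container named my-daemon (the name, image and policy values are illustrative): let Docker restart the container on failure, and have a wrapper check whether the previous run was OOM-killed before running any extra commands.
# Restart the container automatically, up to 5 times, when it exits non-zero
docker run -d --name my-daemon --restart=on-failure:5 my-image

# In a wrapper script (e.g. invoked by systemd), inspect why the last run died
if [ "$(docker inspect --format '{{.State.OOMKilled}}' my-daemon)" = "true" ]; then
    echo "my-daemon was OOM-killed (exit code $(docker inspect --format '{{.State.ExitCode}}' my-daemon))"
    # ...run post-restart commands here...
fi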

Memory used but I can't see the process that uses it (Debian)

Here is my problem:
top - 11:32:47 up 22:20, 2 users, load average: 0.03, 0.72, 1.27
Tasks: 112 total, 1 running, 110 sleeping, 1 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8193844k total, 7508292k used, 685552k free, 80636k buffers
Swap: 2102456k total, 15472k used, 2086984k free, 7070220k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28555 root 20 0 57424 38m 1492 S 0 0.5 0:06.38 bash
28900 root 20 0 39488 7732 3176 T 0 0.1 0:03.12 python
28553 root 20 0 72132 5052 2600 S 0 0.1 0:00.22 sshd
28859 root 20 0 70588 3424 2584 S 0 0.0 0:00.06 sshd
29404 root 20 0 70448 3320 2600 S 0 0.0 0:00.06 sshd
28863 root 20 0 42624 2188 1472 S 0 0.0 0:00.02 sftp-server
29406 root 20 0 19176 1984 1424 S 0 0.0 0:00.00 bash
2854 root 20 0 115m 1760 488 S 0 0.0 5:37.02 rsyslogd
29410 root 20 0 19064 1400 1016 R 0 0.0 0:05.14 top
3111 ntp 20 0 22484 604 460 S 0 0.0 10:26.79 ntpd
3134 proftpd 20 0 64344 452 280 S 0 0.0 6:29.16 proftpd
2892 root 20 0 49168 356 232 S 0 0.0 0:31.58 sshd
1 root 20 0 27388 284 132 S 0 0.0 0:01.38 init
3121 root 20 0 4308 248 172 S 0 0.0 0:16.48 mdadm
As you can see, 7.5 GB of memory is used, but there is no process that appears to use it.
How can that be, and how do I fix it?
Thanks for any answer.
www.linuxatemyram.com
It's too good of a site to ruin by copy/pasting the entire contents here.
In order to see all processes, you can use this command:
ps aux
Then try different views, for example the process tree:
ps faux
Hope that helps.
If your system starts using the swap file, then you have high memory load. Depending on the file system and the programs you use, Linux may allocate nearly all of your system memory, but that doesn't mean the programs are actually using it.
Lots of Ubuntu and Debian servers that we use show only 32 or 64 MB of free memory, yet they don't touch swap.
I'm not a Linux guru, however, so please correct me if I'm wrong :)
I don't have a Linux box handy to experiment with, but it looks like you can sort top's output with interactive commands, so you could bring the biggest memory users to the top. Check the man page and experiment.
Update: in the version of top I have (procps 3.2.7), you can hit "<" and ">" to change the field it sorts by. It doesn't actually say which field that is; you have to watch how the display changes. It's not hard once you experiment a little.
However, Arrowmaster's point (that the memory is probably being used for cache) is a better answer. Use "free" to see how much is actually being used.
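As a quick check (a sketch, not taken from the original answers), the following shows how much memory is really consumed by processes versus cache, and which processes are the biggest consumers; ps --sort is assumed to be GNU ps:
# On older procps, read the "-/+ buffers/cache" line; on newer versions,
# read the "available" column instead of "free"
free -m
# The ten largest memory consumers by resident set size
ps aux --sort=-rss | head -n 10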
I had a similar problem. I was running Raspbian on a Pi B+ with a TP-Link USB Wireless LAN stick connected. The stick caused a problem which resulted in nearly all memory being consumed on system start (around 430 of 445 MB). Just like in your case, the running processes did not consume that much memory. When I removed the stick and rebooted everything was fine, just 50 MB memory consumption.
