What is causing an extremely low memory fragmentation ratio?

We are seeing some odd memory issues with our Redis 4.0.2 instances. The master instance has a ratio of 0.12, whereas the slaves have reasonable ratios that hover just above 1. When we restart the master instance, the memory fragmentation ratio goes back to 1 until we hit our peak load times, and then the ratio drops back down to less than 0.2. The OS (Ubuntu) reports that the Redis instance is using 13GB of virtual memory but only 1.6GB of RAM. Once this happens, most of the data gets swapped out to disk and performance grinds almost to a halt.
Our keys tend to last for a day or two before being purged. Most values are hashes and zsets with ~100 or so entries and each entry being less than 1kb or so.
We are not sure what is causing this. We have tried tweaking the OS overcommit_memory setting, and we also tried the new MEMORY PURGE command, but neither seems to help. We are looking for other things to explore and suggestions to try. Any advice would be appreciated.
What is the likely cause of this and how can we bring the ratio back closer to 1?
Here is a dump of our memory info:
127.0.0.1:8000> info memory
# Memory
used_memory:12955019496
used_memory_human:12.07G
used_memory_rss:1676115968
used_memory_rss_human:1.56G
used_memory_peak:12955019496
used_memory_peak_human:12.07G
used_memory_peak_perc:100.00%
used_memory_overhead:19789422
used_memory_startup:765600
used_memory_dataset:12935230074
used_memory_dataset_perc:99.85%
total_system_memory:33611145216
total_system_memory_human:31.30G
used_memory_lua:945152
used_memory_lua_human:923.00K
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
mem_fragmentation_ratio:0.13
mem_allocator:jemalloc-4.0.3
active_defrag_running:0
lazyfree_pending_objects:0
And our memory stats:
127.0.0.1:8000> memory stats
1) "peak.allocated"
2) (integer) 12954706848
3) "total.allocated"
4) (integer) 12954623968
5) "startup.allocated"
6) (integer) 765600
7) "replication.backlog"
8) (integer) 1048576
9) "clients.slaves"
10) (integer) 33716
11) "clients.normal"
12) (integer) 184494
13) "aof.buffer"
14) (integer) 0
15) "db.0"
16) 1) "overhead.hashtable.main"
2) (integer) 17691184
3) "overhead.hashtable.expires"
4) (integer) 32440
17) "overhead.total"
18) (integer) 19756010
19) "keys.count"
20) (integer) 337422
21) "keys.bytes-per-key"
22) (integer) 38390
23) "dataset.bytes"
24) (integer) 12934867958
25) "dataset.percentage"
26) "99.853401184082031"
27) "peak.percentage"
28) "99.999359130859375"
29) "fragmentation"
30) "0.12932859361171722"

Related

The cgroup rss memory is much higher than the sum of the memory usage of all processes in the docker container

I have a Redis instance running in a container.
Inside the container, the cgroup shows about 1283MB of memory in use.
The kmem memory usage is 30.75MB.
The sum of the memory usage of all processes in the docker container is 883MB.
How can I account for the "disappeared memory" (1296 - 883 - 30 = 383MB)? The "disappeared memory" keeps growing over time, and eventually the container gets OOM-killed.
Environment info:
redis version:4.0.1
docker version:18.09.9
k8s version:1.13
**The memory usage is 1283MB**
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.usage_in_bytes
1346289664 >>>> 1283.921875 MB
**The kmem memory usage is 30.75MB**
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.kmem.usage_in_bytes
32194560 >>> 30.703125 MB
root@redis-m-s-rbac-0:/opt# cat /sys/fs/cgroup/memory/memory.stat
cache 3358720
rss 1359073280 >>> 1296.11328125 MB
rss_huge 515899392
shmem 0
mapped_file 405504
dirty 0
writeback 0
swap 0
pgpgin 11355630
pgpgout 11148885
pgfault 25710366
pgmajfault 0
inactive_anon 0
active_anon 1359245312
inactive_file 2351104
active_file 1966080
unevictable 0
hierarchical_memory_limit 4294967296
hierarchical_memsw_limit 4294967296
total_cache 3358720
total_rss 1359073280
total_rss_huge 515899392
total_shmem 0
total_mapped_file 405504
total_dirty 0
total_writeback 0
total_swap 0
total_pgpgin 11355630
total_pgpgout 11148885
total_pgfault 25710366
total_pgmajfault 0
total_inactive_anon 0
total_active_anon 1359245312
total_inactive_file 2351104
total_active_file 1966080
total_unevictable 0
**The sum of the memory usage of all processes in the docker container is 883MB**
root@redis-m-s-rbac-0:/opt# ps aux | awk '{sum+=$6} END {print sum / 1024}'
883.609
This is happening because usage_in_bytes does not show the exact value of memory and swap usage; memory.usage_in_bytes shows the current memory (RSS + cache) usage as a fuzz value. From section 5.5 of the kernel's cgroup memory documentation:
5.5 usage_in_bytes: For efficiency, as other kernel components, memory cgroup uses some optimization to avoid unnecessary cacheline false sharing. usage_in_bytes is affected by the method and doesn't show 'exact' value of memory (and swap) usage, it's a fuzz value for efficient access. (Of course, when necessary, it's synchronized.) If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP) value in memory.stat (see 5.2).
Reference:
https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
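As a hedged illustration of that advice (a sketch assuming cgroup v1 paths as in the question, not code from the answer), the "exact" figure can be computed from memory.stat like this:

def exact_usage_bytes(stat_path="/sys/fs/cgroup/memory/memory.stat"):
    # Parse memory.stat into a dict of counters (values are bytes).
    stats = {}
    with open(stat_path) as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)
    # Per the kernel docs: RSS + CACHE (+ SWAP) is the exact usage,
    # unlike the fuzzed memory.usage_in_bytes value.
    return stats["rss"] + stats["cache"] + stats.get("swap", 0)

print(exact_usage_bytes() / 1024 / 1024, "MB")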

How to calculate memory usage as a percentage from the docker API, i.e. /containers/{id}/stats

When using the docker API /containers/{id}/stats I am able to get memory usage from the JSON under memory_stats.
"memory_stats": {
"stats": {},
"max_usage": 6651904,
"usage": 6537216,
"failcnt": 0,
"limit": 67108864
},
The question is, how do you calculate the percentage of memory used by the container from this? I have googled for documentation on this but couldn't find any.
This won't give the same answer as the docker stats command.
Docker uses multiple combinations to arrive at MEM%; according to my calculations you should take the cache and active memory into consideration as well.
The formula would look something like:
def get_mem_perc(stats):
    mem_used = (stats["memory_stats"]["usage"]
                - stats["memory_stats"]["stats"]["cache"]
                + stats["memory_stats"]["stats"]["active_file"])
    limit = stats["memory_stats"]["limit"]
    return round(mem_used / limit * 100, 2)
Where stats is the stats value for the entire container.
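A short usage sketch for the get_mem_perc above (assuming the docker Python SDK is installed and can reach the daemon; "my-container" is a placeholder name, not from the question):

import docker

client = docker.from_env()
# stats(stream=False) returns the same JSON as GET /containers/{id}/stats
stats = client.containers.get("my-container").stats(stream=False)
print(get_mem_perc(stats))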
Docker is open source, which means we can look at the source code :)
Just parse the JSON object and divide usage by the limit. Keep in mind that the limit is the limit of the entire host machine, and exceeding it will result in an OOM kill.
This is also what's displayed when running docker stats and looking at MEM USAGE / LIMIT and MEM %:
return float(json_object['memory_stats']['usage']) / float(json_object['memory_stats']['limit']) * 100
Confirmed by looking at the open source repo here

Apache Ignite use too much RAM

I've tried to use Ignite to store events, but ran into a problem of excessive RAM usage while inserting new data.
I'm running an Ignite node with a 1GB heap and the default configuration.
curs.execute("""CREATE TABLE trololo (id LONG PRIMARY KEY, user_id LONG, event_type INT, timestamp TIMESTAMP) WITH "template=replicated" """);
n = 10000
for i in range(200):
values = []
for j in range(n):
id_ = i * n + j
event_type = random.randint(1, 5)
user_id = random.randint(1000, 5000)
timestamp = datetime.datetime.utcnow() - timedelta(hours=random.randint(1, 100))
values.append("({id}, {user_id}, {event_type}, '{timestamp}')".format(
id=id_, user_id=user_id, event_type=event_type, uid=uid, timestamp=timestamp.strftime('%Y-%m-%dT%H:%M:%S-00:00')
))
query = "INSERT INTO trololo (id, user_id, event_type, TIMESTAMP) VALUES %s;" % ",".join(values)
curs.execute(query)
But after loading about 10^6 events, I got 100% CPU usage because the whole heap was taken and the GC was trying (unsuccessfully) to free some space.
Then I stopped for about 10 minutes, after which the GC successfully freed some space and I could continue loading new data.
Then the heap filled up again and the cycle repeated.
This is really strange behaviour, and I couldn't find a way to load 10^7 events without running into these problems.
Approximately, each event should take:
8 + 8 + 4 + 10 (timestamp size?) ≈ 30 bytes
30 bytes × 3 (overhead), so it should be less than 100 bytes per record
So 10^7 * 10^2 = 10^9 bytes = 1GB
So it seems that 10^7 events should fit into 1GB of RAM, shouldn't it?
Actually, since version 2.0, Ignite stores everything off-heap with the default settings.
The main problem here is that you generate a very big query string with 10000 inserts, which has to be parsed and, of course, will be stored on the heap. After decreasing this size for each query, you will get better results; see the sketch below.
But also, as you can see in the capacity planning documentation, Ignite adds around 200 bytes of overhead for each entry. Additionally, add around 200-300MB per node for internal memory and a reasonable amount of memory for the JVM and GC to operate efficiently.
If you really want to use only a 1GB heap you can try to tune the GC, but I would recommend increasing the heap size.
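A rough sketch of the "smaller batches" suggestion, reusing the curs cursor from the question (the batch size of 500 is an arbitrary starting point, not a value from the answer). Note that by the answer's numbers, 10^7 entries come to roughly 10^7 × (100 + 200) bytes ≈ 3GB of storage before the per-node internal memory, so the original 1GB estimate was optimistic anyway.

BATCH_SIZE = 500  # far smaller than the original 10000-row SQL string; tune as needed

def insert_events(curs, rows):
    # rows: iterable of (id, user_id, event_type, timestamp_string) tuples
    sql = "INSERT INTO trololo (id, user_id, event_type, timestamp) VALUES %s;"
    batch = []
    for row in rows:
        batch.append("(%d, %d, %d, '%s')" % row)
        if len(batch) == BATCH_SIZE:
            curs.execute(sql % ",".join(batch))
            batch = []
    if batch:  # flush the final partial batch
        curs.execute(sql % ",".join(batch))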

What is the reason for Redis memory loss?

Nobody flushes the DB; we only HGET after start-up and HSET all the data.
But after some time (about 1 day), the memory drops and no data exists any more...
# Memory
used_memory:817064
used_memory_human:797.91K
used_memory_rss:52539392
used_memory_peak:33308069304
used_memory_peak_human:31.02G
used_memory_lua:36864
mem_fragmentation_ratio:64.30
mem_allocator:jemalloc-3.6.0
After I restart the server and HSET everything again, used_memory recovers.
# Memory
used_memory:33291293520
used_memory_human:31.00G
used_memory_rss:33526530048
used_memory_peak:33291293520
used_memory_peak_human:31.00G
used_memory_lua:36864
mem_fragmentation_ratio:1.01
mem_allocator:jemalloc-3.6.0
But it never lasts longer than 1 day... The HSET process needs at least 4h, and Redis takes up over half of the memory, so BGSAVE is useless...
What is the reason for the memory loss, and how can I back up the data?

Memory Management in Ruby

I'm puzzled by some behaviour of Ruby and how it manages memory.
I understand Ruby's GC (major or minor) behaviour, i.e. if any object count goes above its threshold value or limit (i.e. heap_available_slots, old_objects_limit, remembered_shady_object_limit, malloc_limit), Ruby runs/triggers a GC (major or minor).
And after GC, if it can't find enough memory, Ruby allocates (basically via malloc, I assume) more memory for the running program.
Also, it's a known fact that Ruby does not release memory back to the OS immediately.
Now...
What I fail to understand is how Ruby releases memory (back to the OS) without triggering any GC.
Example
require 'rbtrace'

index = 1
array = []
while (index < 20000000) do
  array << index
  index += 1
end
sleep 10
print "-"
array = nil
sleep
Here is my example, running the above code on ruby 2.2.2p95.
htop shows the RSS of the process (test.rb, PID 11843) reaching 161MB.
GC.stat (captured via the rbtrace gem) looks like this (pay close attention to the GC count):
rbtrace -p 11843 -e '[Time.now,Process.pid,GC.stat]'
[Time.now,Process.pid,GC.stat]
=> [2016-07-27 13:50:28 +0530, 11843,
{
"count": 7,
"heap_allocated_pages": 74,
"heap_sorted_length": 75,
"heap_allocatable_pages": 0,
"heap_available_slots": 30162,
"heap_live_slots": 11479,
"heap_free_slots": 18594,
"heap_final_slots": 89,
"heap_marked_slots": 120,
"heap_swept_slots": 18847,
"heap_eden_pages": 74,
"heap_tomb_pages": 0,
"total_allocated_pages": 74,
"total_freed_pages": 0,
"total_allocated_objects": 66182,
"total_freed_objects": 54614,
"malloc_increase_bytes": 8368,
"malloc_increase_bytes_limit": 33554432,
"minor_gc_count": 4,
"major_gc_count": 3,
"remembered_wb_unprotected_objects": 0,
"remembered_wb_unprotected_objects_limit": 278,
"old_objects": 14,
"old_objects_limit": 10766,
"oldmalloc_increase_bytes": 198674592,
"oldmalloc_increase_bytes_limit": 20132659
}]
*** detached from process 11843
GC count => 7
Approximately 25 minutes later, memory has dropped down to 6MB, but the GC count is still 7.
[Time.now,Process.pid,GC.stat]
=> [2016-07-27 14:16:02 +0530, 11843,
{
"count": 7,
"heap_allocated_pages": 74,
"heap_sorted_length": 75,
"heap_allocatable_pages": 0,
"heap_available_slots": 30162,
"heap_live_slots": 11581,
"heap_free_slots": 18581,
"heap_final_slots": 0,
"heap_marked_slots": 120,
"heap_swept_slots": 18936,
"heap_eden_pages": 74,
"heap_tomb_pages": 0,
"total_allocated_pages": 74,
"total_freed_pages": 0,
"total_allocated_objects": 66284,
"total_freed_objects": 54703,
"malloc_increase_bytes": 3248,
"malloc_increase_bytes_limit": 33554432,
"minor_gc_count": 4,
"major_gc_count": 3,
"remembered_wb_unprotected_objects": 0,
"remembered_wb_unprotected_objects_limit": 278,
"old_objects": 14,
"old_objects_limit": 10766,
"oldmalloc_increase_bytes": 198663520,
"oldmalloc_increase_bytes_limit": 20132659
}]
Question: I was under the impression that Ruby releases memory whenever GC is triggered, but clearly that's not the case here.
Can anybody provide details on how the memory is released back to the OS (as in, what triggered the memory release, since it's surely not the GC)?
OS: OS X version 10.11.12
You are correct, it's not GC that changed the physical memory requirements, it's the OS kernel.
You need to look at the VIRT column, not the RES column. As you can see VIRT stays exactly the same.
RES is physical (resident) memory, VIRT is virtual (allocated, but currently unused) memory.
When the process sleeps it's not using its memory or doing anything, so the OS memory manager decides to swap out part of the physical memory and move it into virtual space.
Why keep an idle process hogging physical memory for no reason? The OS is smart and swaps out as much unused physical memory as possible; that's why you see a reduction in RES.
I suspect you would see the same effect even without array = nil, by just sleeping long enough. Once you stop sleeping and access something in the array, then RES will jump back up again.
You can read some more discussion through these:
What is RSS and VSZ in Linux memory management
http://www.darkcoding.net/software/resident-and-virtual-memory-on-linux-a-short-example/
What's the difference between "virtual memory" and "swap space"?
http://www.tldp.org/LDP/tlk/mm/memory.html
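As a small Linux-side illustration of the RES/VIRT distinction discussed above (a Python sketch reading /proc, which does not exist on the asker's OS X, so this is purely illustrative):

def res_vs_virt(pid="self"):
    # VmRSS is the resident set (htop's RES), VmSize the virtual size (VIRT), both in kB.
    sizes = {}
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith(("VmRSS:", "VmSize:")):
                key, value, unit = line.split()
                sizes[key.rstrip(":")] = int(value)
    return sizes

print(res_vs_virt())  # e.g. {'VmSize': 164000, 'VmRSS': 161000} while the array is live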
