All allocated Tarantool 2.3 memtx space is occupied - Lua

Today the space allocated for Tarantool's memtx ran out (memtx_memory = 5 GB). RAM usage really was at 5 GB, and after restarting Tarantool more than 4 GB was freed.
What could the RAM be clogged with? Which settings could this be related to?
box.slab.info()
---
- items_size: 1308568936
  items_used_ratio: 91.21%
  quota_size: 5737418240
  quota_used_ratio: 13.44%
  arena_used_ratio: 89.2%
  items_used: 1193572600
  quota_used: 1442840576
  arena_size: 1442840576
  arena_used: 1287551224
box.info()
---
- version: 2.3.2-26-g38e825b
  id: 1
  ro: false
  uuid: d9cb7d78-1277-4f83-91dd-9372a763aafa
  package: Tarantool
  cluster:
    uuid: b6c32d07-b448-47df-8967-40461a858c6d
  replication:
    1:
      id: 1
      uuid: d9cb7d78-1277-4f83-91dd-9372a763aafa
      lsn: 89759968433
    2:
      id: 2
      uuid: 77557306-8e7e-4bab-adb1-9737186bd3fa
      lsn: 9
    3:
      id: 3
      uuid: 28bae7dd-26a8-47a7-8587-5c1479c62311
      lsn: 0
    4:
      id: 4
      uuid: 6a09c191-c987-43a4-8e69-51da10cc3ff2
      lsn: 0
  signature: 89759968442
  status: running
  vinyl: []
  uptime: 606297
  lsn: 89759968433
  sql: []
  gc: []
  pid: 32274
  memory: []
  vclock: {2: 9, 1: 89759968433}
cat /etc/tarantool/instances.available/my_app.lua
...
memtx_memory = 5 * 1024 * 1024 * 1024,
...
Tarantool version 2.3.2, OS CentOS 7
https://i.stack.imgur.com/onV44.png

It's the result of a process called fragmentation.
The simple reason for it is the following situation:
you have some area allocated for tuples
you put one tuple there, then you put another one
when the first tuple needs to grow, the database has to relocate it to another place with enough capacity; after that, the place of the first tuple becomes free, but a new place has already been taken for the extended tuple
You can decrease fragmentation by increasing the tuple size for your case.
Choose the size by estimating your typical data, or find the optimal size by watching the metrics of your workload over time.
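If you want to see which size classes the memory is going into, a sketch along these lines can help (assuming Tarantool 2.x, where box.slab.stats() reports per-size-class usage and the slab_alloc_factor option of box.cfg{} controls how coarse those classes are):
-- Print per-size-class slab usage; classes with many slabs but a lot of
-- mem_free are where fragmentation shows up.
for _, class in ipairs(box.slab.stats()) do
    print(string.format('item_size=%d slabs=%d mem_used=%d mem_free=%d',
        class.item_size, class.slab_count, class.mem_used, class.mem_free))
end
-- slab_alloc_factor (applied at instance startup) sets how far apart the
-- size classes are, i.e. how tightly tuples fit into their slabs.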

Related

Size in MB of mnesia table

How do you read the :mnesia.info?
For example, I only have one table, some_table, and :mnesia.info returns this.
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
some_table: with 16020 records occupying 433455 words of mem
schema : with 2 records occupying 536 words of mem
===> System info in version "4.15.5", debug level = none <===
opt_disc. Directory "/home/ubuntu/project/Mnesia.nonode#nohost" is NOT used.
use fallback at restart = false
running db nodes = [nonode#nohost]
stopped db nodes = []
master node tables = []
remote = []
ram_copies = ['some_table',schema]
disc_copies = []
disc_only_copies = []
[{nonode#nohost,ram_copies}] = [schema,'some_table']
488017 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
Also calling:
:mnesia.table_info("some_table", :size)
It returns 16020, which I think is the number of keys, but how can I get the memory usage?
First, you need mnesia:table_info(Table, memory) to obtain the number of words occupied by your table; in your example you are getting the number of items in the table, not the memory. To convert that value to MB, first use erlang:system_info(wordsize) to get the word size in bytes for your machine's architecture (on a 32-bit system a word is 4 bytes, on 64-bit it is 8 bytes), multiply it by your Mnesia table's memory to obtain the size in bytes, and finally convert the value to megabytes like:
MnesiaMemoryMB = (mnesia:table_info(some_table, memory) * erlang:system_info(wordsize)) / (1024*1024).
You can use erlang:system_info(wordsize) to get the word size in bytes; on a 32-bit system a word is 32 bits or 4 bytes, on 64-bit it's 8 bytes. So your table is using 433455 x wordsize bytes.
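As a rough check with the numbers from the question (assuming a 64-bit VM, i.e. 8-byte words):
%% 433455 words * 8 bytes/word = 3467640 bytes, i.e. roughly 3.3 MB
(mnesia:table_info(some_table, memory) * erlang:system_info(wordsize)) / (1024*1024).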

Apache Ignite use too much RAM

I've tried to use Ignite to store events, but I'm facing a problem of too much RAM usage while inserting new data.
I'm running an Ignite node with a 1 GB heap and the default configuration.
curs.execute("""CREATE TABLE trololo (id LONG PRIMARY KEY, user_id LONG, event_type INT, timestamp TIMESTAMP) WITH "template=replicated" """);
n = 10000
for i in range(200):
values = []
for j in range(n):
id_ = i * n + j
event_type = random.randint(1, 5)
user_id = random.randint(1000, 5000)
timestamp = datetime.datetime.utcnow() - timedelta(hours=random.randint(1, 100))
values.append("({id}, {user_id}, {event_type}, '{timestamp}')".format(
id=id_, user_id=user_id, event_type=event_type, uid=uid, timestamp=timestamp.strftime('%Y-%m-%dT%H:%M:%S-00:00')
))
query = "INSERT INTO trololo (id, user_id, event_type, TIMESTAMP) VALUES %s;" % ",".join(values)
curs.execute(query)
But after loading about 10^6 events, I get 100% CPU usage because the whole heap is taken and GC is trying (unsuccessfully) to free some space.
Then I stop for about 10 minutes, after which GC successfully frees some space and I can continue loading new data.
Then the heap fills up again and it starts all over.
It's really strange behaviour, and I couldn't find a way to load 10^7 events without these problems.
Approximately, an event should take:
8 + 8 + 4 + 10 (timestamp size?) is about 30 bytes
30 bytes x 3 (overhead), so it should be less than 100 bytes per record
So 10^7 * 10^2 = 10^9 bytes = 1 GB
So it seems that 10^7 events should fit into 1 GB of RAM, shouldn't it?
Actually, since version 2.0, Ignite stores everything off-heap with default settings.
The main problem here is that you generate a very big query string with 10,000 inserts, which has to be parsed and, of course, is held in the heap. After decreasing this size for each query, you will get better results.
Also, as you can see in the capacity planning documentation, Ignite adds around 200 bytes of overhead per entry. Additionally, add around 200-300 MB per node for internal memory and a reasonable amount of memory for the JVM and GC to operate efficiently.
If you really want to use only a 1 GB heap you can try to tune GC, but I would recommend increasing the heap size.
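For example, here is a minimal sketch of the same loader flushing in small batches so each SQL string stays short (the batch size of 500 is an illustrative guess, not a tuned value):
# Flush every `batch_size` rows instead of building one 10000-row statement.
batch_size = 500
insert_sql = "INSERT INTO trololo (id, user_id, event_type, timestamp) VALUES %s;"
rows = []
for event_id in range(10**7):
    event_type = random.randint(1, 5)
    user_id = random.randint(1000, 5000)
    ts = datetime.datetime.utcnow() - timedelta(hours=random.randint(1, 100))
    rows.append("({0}, {1}, {2}, '{3}')".format(
        event_id, user_id, event_type, ts.strftime('%Y-%m-%dT%H:%M:%S-00:00')))
    if len(rows) == batch_size:
        curs.execute(insert_sql % ",".join(rows))
        rows = []
if rows:
    curs.execute(insert_sql % ",".join(rows))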

Single-threaded program profiles 15% of runtime in semaphore_wait_trap

On Mac OS using mono, if I compile and profile the program below, I get the following results:
% fsharpc --nologo -g foo.fs -o foo.exe
% mono --profile=default:stat foo.exe
...
Statistical samples summary
Sample type: cycles
Unmanaged hits: 336 (49.1%)
Managed hits: 349 (50.9%)
Unresolved hits: 1 ( 0.1%)
Hits % Method name
154 22.48 Microsoft.FSharp.Collections.SetTreeModule:height ...
105 15.33 semaphore_wait_trap
74 10.80 Microsoft.FSharp.Collections.SetTreeModule:add ...
...
Note the second entry, semaphore_wait_trap.
Here is the program:
[<EntryPoint>]
let main args =
let s = seq { 1..1000000 } |> Set.ofSeq
s |> Seq.iter (fun _ -> ())
0
I looked in the source for the Set module, but I didn't find any (obvious) locking.
Is my single-threaded program really spending 15% of its execution time messing with semaphores? If it is, can I make it not do that and get a performance boost?
According to Instruments, it's sgen/gc calling semaphore_wait_trap:
Sgen is documented as stopping all other threads while it collects:
Before doing a collection (minor or major), the collector must stop
all running threads so that it can have a stable view of the current
state of the heap, without the other threads changing it
In other words, when the code is trying to allocate memory and a GC is required, the time it takes shows up under semaphore_wait_trap since that's your application thread. I suspect the mono profiler doesn't profile the gc thread itself so you don't see the time in the collection code.
The germane output then is really the GC summary:
GC summary
GC resizes: 0
Max heap size: 0
Object moves: 1002691
Gen0 collections: 123, max time: 14187us, total time: 354803us, average: 2884us
Gen1 collections: 3, max time: 41336us, total time: 60281us, average: 20093us
If you want your code to run faster, don't collect as often.
Understanding the actual cost of collection can be done through dtrace since sgen has dtrace probes.
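One thing worth experimenting with (a general sgen knob, not something taken from the profile above) is a bigger nursery, so minor collections happen less often:
% MONO_GC_PARAMS=nursery-size=64m mono --profile=default:stat foo.exe
Reducing allocations in the first place, of course, helps even more than tuning when collections happen.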

erlang crash dump no more index entries

I have a problem with Erlang.
One of my Erlang nodes crashed and generated an erl_crash.dump with the reason no more index entries in atom_tab (max=1048576).
I checked the dump file and found a lot of atoms of the form 'B\2209\000... (about 1,000,000 entries):
=proc:<0.11744.7038>
State: Waiting
Name: 'B\2209\000d\022D.
Spawned as: proc_lib:init_p/5
Spawned by: <0.5032.0>
Started: Sun Feb 23 05:23:27 2014
Message queue length: 0
Number of heap fragments: 0
Heap fragment data: 0
Reductions: 1992
Stack+heap: 1597
OldHeap: 1597
Heap unused: 918
OldHeap unused: 376
Program counter: 0x0000000001eb7700 (gen_fsm:loop/7 + 140)
CP: 0x0000000000000000 (invalid)
arity = 0
Do you have any experience with what they might be?
Atoms
By default, the maximum number of atoms is 1048576. This limit can be raised or lowered using the +t option.
Note: an atom refers into an atom table which also consumes memory. The atom text is stored once for each unique atom in this table. The atom table is not garbage-collected.
I think you are producing a lot of atoms in your program, and the number of atoms has reached the atom limit.
You can use the +t option to change the atom limit of your Erlang VM when you start your Erlang node.
So it tells you that you are generating atoms. Somewhere there is a list_to_atom/1 call with a variable argument. Because you have processes with this sort of name, you are register/2-ing processes under these names. It may be your own code or some third-party module you use. It's bad behaviour; don't do that, and don't use modules that do it.
To be honest, I can imagine a design where I would do it intentionally, but it is a very special case and obviously not the case here.
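A small sketch of the pattern to look for and its safer counterpart (the worker-naming scheme here is a hypothetical illustration, not taken from the question):
-module(name_demo).
-export([bad/2, better/2]).

%% Dangerous: every distinct Id mints a brand-new atom, and atoms are never
%% garbage-collected, so the atom table eventually overflows.
bad(Id, Pid) ->
    register(list_to_atom("worker_" ++ integer_to_list(Id)), Pid).

%% Safer: only resolve atoms that already exist (this fails loudly instead of
%% leaking), or avoid atoms entirely and keep a Name -> Pid mapping in ETS.
better(Id, Pid) ->
    register(list_to_existing_atom("worker_" ++ integer_to_list(Id)), Pid).
Raising the limit with +t only postpones the crash; the dynamic list_to_atom/1 and register/2 calls are what need fixing.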

AIX 7.1 Creating an enhanced JFS file system issues?

I am fairly new to AIX and am trying to create my first jfs file system mounted on /usr1.
I started off by creating my volume group from the available disks..
#/usr/sbin/mklv -y'vl_usr1' -t'jfs2' -c'2' vg_usr1 6005
van-oppy# lsvg vg_usr1
VOLUME GROUP: vg_usr1 VG IDENTIFIER: 000015010000d60000000140f464119b
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 12012 (1537536 megabytes)
MAX LVs: 256 FREE PPs: 2 (256 megabytes)
LVs: 1 USED PPs: 12010 (1537280 megabytes)
OPEN LVs: 0 QUORUM: 12 (Enabled)
TOTAL PVs: 22 VG DESCRIPTORS: 22
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 22 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
PV RESTRICTION: none INFINITE RETRY: no
Below is the lslv of the logical volume...
# lslv vl_usr1
LOGICAL VOLUME: vl_usr1 VOLUME GROUP: vg_usr1
LV IDENTIFIER: 000015010000d60000000140f464119b.1 PERMISSION: read/write
VG STATE: active/complete LV STATE: closed/syncd
TYPE: jfs2 WRITE VERIFY: off
MAX LPs: 6005 PP SIZE: 128 megabyte(s)
COPIES: 2 SCHED POLICY: parallel
LPs: 6005 PPs: 12010
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
INFINITE RETRY: no
When creating the logical volume I've set it to have 2 copies, so now my free LPs are 6005.
When I next try to create an enhanced JFS2 file system, it fails..
Add an Enhanced Journaled File System
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[TOP] [Entry Fields]
Volume group name vg_usr1
SIZE of file system
Unit Size Megabytes +
* Number of units [768640] #
* MOUNT POINT [/usr1]
Mount AUTOMATICALLY at system restart? yes +
PERMISSIONS read/write +
Mount OPTIONS [] +
Block Size (bytes) 4096 +
Logical Volume for Log +
Inline Log size (MBytes) [] #
Extended Attribute Format +
ENABLE Quota Management? no +
[MORE...3]
F1=Help F2=Refresh F3=Cancel F4=List
Esc+5=Reset Esc+6=Command Esc+7=Edit Esc+8=Image
Esc+9=Shell Esc+0=Exit Enter=Do
The commit fails with..
0516-404 allocp: This system cannot fulfill the allocation request. There are not enough free partitions or not enough physical volumes to keep strictness and satisfy allocation requests. The command should be retried with different allocation characteristics.
0516-822 mklv: Unable to create logical volume.
crfs: 0506-972 Cannot create logical volume for file system.
rmlv: Logical volume loglv00 is removed.
I am confused as to why. My free PPs were 6005, so 6005 x 128 = 768640 MB, which is what I set during creation. I also tried lowering the number to 768000 MB, which is 6000 PPs, but still no go.
Any AIX experts out there able to shed some light as to why this is not working? Trying to wrap my head around how the LVM works...
I have resolved my issue: I was allocating more space than was available and using the wrong calculations to come up with the maximum available.
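For anyone hitting the same error, the arithmetic (a hedged reading of the lsvg/lslv output above) works out like this:
Total PPs in vg_usr1:   12012                  (12012 x 128 MB = 1537536 MB)
Used by vl_usr1:        6005 LPs x 2 copies    = 12010 PPs
Free PPs left:          12012 - 12010 = 2      (2 x 128 MB = 256 MB)
Requested file system:  768640 MB / 128 MB     = 6005 additional PPs
So the 6005 figure was the number of LPs the mirrored vl_usr1 had already consumed, not free space, and crfs also needs a little room for the jfs2 log volume (loglv00), which is why even the smaller 768000 MB request failed.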
