How to distribute records over Mnesia fragments? (Erlang)

I have two Erlang Mnesia nodes running in a cluster.
I created the table with the properties below:
mnesia:create_table(vmq_offline_store,
    [{frag_properties,
      [{node_pool, [node() | nodes()]},
       {hash_module, verneDB_frag_hash},
       {n_fragments, 8},
       {n_disc_only_copies, length([node() | nodes()])}]},
     {index, []},
     {type, bag},
     {attributes, record_info(fields, vmq_offline_store)}]).
I can see all 8 fragments created on the two Erlang nodes.
After this, I inserted 50000 records into the table using an RPC call from an external node. All 50000 records went into vmq_offline_store only; they were not distributed over the fragments:
vmq_offline_store: with 50000 records occupying 2096701142 bytes on disc
vmq_offline_store_frag2: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag3: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag4: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag5: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag6: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag7: with 0 records occupying 5464 bytes on disc
vmq_offline_store_frag8: with 0 records occupying 5464 bytes on disc
Could you please help me distribute the records over the fragments?

It's not enough to create the Mnesia table with fragmentation properties. Every table operation must explicitly specify the "access module" for fragmented tables, mnesia_frag. This is done by calling the function mnesia:activity/4, instead of calling mnesia:transaction/1 or using dirty operations.
For example, this code:
Fun = fun() -> ... end,
{atomic, Result} = mnesia:transaction(Fun),
becomes:
Fun = fun() -> ... end,
Result = mnesia:activity(transaction, Fun, [], mnesia_frag),
(Note that on errors mnesia:activity signals an error instead of returning {aborted, Reason}.)
For dirty operations, code like this:
mnesia:dirty_write(MyRecord)
becomes:
mnesia:activity(sync_dirty, mnesia, write, [MyRecord], mnesia_frag)
or alternatively:
mnesia:activity(sync_dirty, fun() -> mnesia:write(MyRecord) end, [], mnesia_frag)
That is, never use the mnesia:dirty_* functions; use the "bare" ones within a dirty activity.
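Applied to the table above, a minimal sketch (the function names are made up and the record shape is assumed; adapt it to the real vmq_offline_store definition). Every write and read goes through mnesia:activity/4 with mnesia_frag, so the hash module decides which fragment each key lands in:
%% Assumed record shape; adjust to the actual vmq_offline_store fields.
write_offline(Rec) ->
    mnesia:activity(transaction, fun() -> mnesia:write(Rec) end, [], mnesia_frag).

read_offline(Key) ->
    mnesia:activity(transaction,
                    fun() -> mnesia:read(vmq_offline_store, Key) end,
                    [], mnesia_frag).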

Related

Get or calculate Mnesia total size

I want to find the total size of a Mnesia database. I have only one node.
Can I get the size of Mnesia from some function, or can I calculate it somehow?
I have looked in the documentation http://erlang.org/doc/man/mnesia.html, but I cannot find a function that gives such information for the whole database.
Do I need to calculate it per table using table_info/2? And if so, how?
NOTE: I don't know how to do that with the current data points; the size is 2 (for testing I have only 2 entries) and the memory is 348.
You need to iterate over all the tables with mnesia:system_info(tables) and read each table's memory with mnesia:table_info(Table, memory), which gives the number of words occupied by the table. To convert that value to bytes, use erlang:system_info(wordsize) to get the word size in bytes for your machine architecture (on a 32-bit system a word is 4 bytes; on 64-bit it's 8 bytes) and multiply it by the table's memory. A rough implementation:
%% Obtain the memory in bytes of all the Mnesia tables.
-spec get_mnesia_memory() -> MemInBytes :: number().
get_mnesia_memory() ->
    WordSize = erlang:system_info(wordsize),
    CollectMem = fun(Tbl, Acc) ->
                     Mem = mnesia:table_info(Tbl, memory) * WordSize,
                     Acc + Mem
                 end,
    lists:foldl(CollectMem, 0, mnesia:system_info(tables)).
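The same total can be computed directly in the shell, without a module:
lists:sum([mnesia:table_info(T, memory) || T <- mnesia:system_info(tables)])
    * erlang:system_info(wordsize).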

Size in MB of mnesia table

How do you read the :mnesia.info?
For example, I have only one table, some_table, and :mnesia.info returns this:
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
some_table: with 16020 records occupying 433455 words of mem
schema : with 2 records occupying 536 words of mem
===> System info in version "4.15.5", debug level = none <===
opt_disc. Directory "/home/ubuntu/project/Mnesia.nonode@nohost" is NOT used.
use fallback at restart = false
running db nodes = [nonode@nohost]
stopped db nodes = []
master node tables = []
remote = []
ram_copies = ['some_table',schema]
disc_copies = []
disc_only_copies = []
[{nonode@nohost,ram_copies}] = [schema,'some_table']
488017 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
Also calling:
:mnesia.table_info(:some_table, :size)
It returns me 16020 which I think is the number of keys, but how can I get the memory usage?
First, you need mnesia:table_info(Table, memory) to obtain the number of words occupied by your table; in your example you are getting the number of items in the table, not the memory. To convert that value to MB, first use erlang:system_info(wordsize) to get the word size in bytes for your machine architecture (on a 32-bit system a word is 4 bytes; on 64-bit it's 8 bytes), multiply it by your Mnesia table memory to obtain the size in bytes, and finally convert the value to megabytes:
MnesiaMemoryMB = (mnesia:table_info(some_table, memory) * erlang:system_info(wordsize)) / (1024*1024).
You can use erlang:system_info(wordsize) to get the word size in bytes: on a 32-bit system a word is 32 bits or 4 bytes; on 64-bit it's 8 bytes. So your table is using 433455 x wordsize bytes.
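For the numbers above, assuming a 64-bit system: 433455 words x 8 bytes/word = 3467640 bytes, and 3467640 / (1024 * 1024) ≈ 3.3 MB.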

Apache Ignite use too much RAM

I've tried to use Ignite to store events, but ran into a problem of excessive RAM usage while inserting new data.
I'm running an Ignite node with a 1 GB heap and the default configuration:
curs.execute("""CREATE TABLE trololo (id LONG PRIMARY KEY, user_id LONG, event_type INT, timestamp TIMESTAMP) WITH "template=replicated" """);
n = 10000
for i in range(200):
values = []
for j in range(n):
id_ = i * n + j
event_type = random.randint(1, 5)
user_id = random.randint(1000, 5000)
timestamp = datetime.datetime.utcnow() - timedelta(hours=random.randint(1, 100))
values.append("({id}, {user_id}, {event_type}, '{timestamp}')".format(
id=id_, user_id=user_id, event_type=event_type, uid=uid, timestamp=timestamp.strftime('%Y-%m-%dT%H:%M:%S-00:00')
))
query = "INSERT INTO trololo (id, user_id, event_type, TIMESTAMP) VALUES %s;" % ",".join(values)
curs.execute(query)
But after loading about 10^6 events, I got 100% CPU usage because all the heap was taken and the GC was trying (unsuccessfully) to free some space.
Then I stopped for about 10 minutes, after which the GC successfully freed some space and I could continue loading new data.
Then the heap filled up again, and so on.
It's really strange behaviour, and I couldn't find a way to load 10^7 events without these problems.
Approximately, an event should take:
8 + 8 + 4 + 10 (timestamp size?) = about 30 bytes
30 bytes x 3 (overhead), so it should be less than 100 bytes per record
So 10^7 * 10^2 = 10^9 bytes = 1 GB
So it seems that 10^7 events should fit into 1 GB of RAM, shouldn't it?
Actually, since version 2.0, Ignite stores all data off-heap with the default settings.
The main problem here is that you generate a very big query string with 10000 inserts, which must be parsed and, of course, held on the heap. Decreasing the number of inserts per query will give better results here.
Also, as you can see in the capacity planning documentation, Ignite adds around 200 bytes of overhead per entry. Additionally, allow around 200-300 MB per node for internal memory, plus a reasonable amount of memory for the JVM and GC to operate efficiently.
If you really want to use only a 1 GB heap, you can try to tune the GC, but I would recommend increasing the heap size.
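Redoing the estimate above with that overhead: 10^7 entries x (30 + 200) bytes ≈ 2.3 GB for the data alone, plus 200-300 MB of internal memory per node, so 1 GB falls well short even before leaving headroom for the JVM and GC.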

Mnesia: always suffix fragmented table fragments?

When I create a fragmented table in Mnesia, all of the table fragments get the suffix _fragN except for the first one. This is error-prone: any code that accesses the table without specifying the correct access module will appear to work, since it reads from and writes to the first fragment, but it will not mix with code using the correct access module, since the two will be looking for elements in different places.
Is there a way to tell Mnesia to use a fragment suffix for all table fragments? That would avoid that problem, by making incorrect accesses fail noisily.
For example, if I create a table with four fragments:
1> mnesia:start().
ok
2> mnesia:create_table(foo, [{frag_properties, [{node_pool, [node()]}, {n_fragments, 4}]}]).
{atomic,ok}
then mnesia:info/0 will list the fragments as foo, foo_frag2, foo_frag3 and foo_frag4:
3> mnesia:info().
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
foo : with 0 records occupying 304 words of mem
foo_frag2 : with 0 records occupying 304 words of mem
foo_frag3 : with 0 records occupying 304 words of mem
foo_frag4 : with 0 records occupying 304 words of mem
schema : with 5 records occupying 950 words of mem
===> System info in version "4.14", debug level = none <===
opt_disc. Directory "/Users/legoscia/Mnesia.nonode@nohost" is NOT used.
use fallback at restart = false
running db nodes = [nonode@nohost]
stopped db nodes = []
master node tables = []
remote = []
ram_copies = [foo,foo_frag2,foo_frag3,foo_frag4,schema]
disc_copies = []
disc_only_copies = []
[{nonode@nohost,ram_copies}] = [schema,foo_frag4,foo_frag3,foo_frag2,foo]
3 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
I'd want foo to be foo_frag1 instead. Is that possible?
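To see the failure mode described above, compare a write made through the access module with one that bypasses it (a sketch using the four-fragment foo table from the transcript; the keys are made up):
%% Goes through the fragmentation hash and may land in any of the
%% four fragments:
mnesia:activity(transaction,
                fun() -> mnesia:write({foo, some_key, 1}) end,
                [], mnesia_frag).
%% Bypasses mnesia_frag and always lands in the first fragment, foo,
%% so readers that do use mnesia_frag may look for the key elsewhere:
mnesia:dirty_write({foo, other_key, 2}).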

retrieval of data from ETS table

I know that lookup time is constant for ETS tables. But I also heard that the table is kept outside of the process, and when retrieving data, it needs to be copied to the process heap. So this is expensive. But then, how do you explain this:
18> {Time, [[{ok, Binary}]]} = timer:tc(ets, match, [utilo, {a, '$1'}]).
{0,
[[{ok,<<255,216,255,225,63,254,69,120,105,102,0,0,73,
73,42,0,8,0,0,0,10,0,14,...>>}]]}
19> size(Binary).
1759017
A 1.7 MB binary takes 0 time to retrieve from the table!?
EDIT: After I saw Odobenus Rosmarus's answer, I decided to convert the binary to list. Here is the result:
1> {ok, B} = file:read_file("IMG_2171.JPG").
{ok,<<255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,
0,8,0,0,0,10,0,14,1,2,0,32,...>>}
2> size(B).
1986392
3> L = binary_to_list(B).
[255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,0,8,0,0,
0,10,0,14,1,2,0,32,0,0|...]
4> length(L).
1986392
5> ets:insert(utilo, {a, L}).
true
6> timer:tc(ets, match, [utilo, {a, '$1'}]).
{106000,
[[[255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,0,8,0,
0,0,10,0,14,1,2|...]]]}
Now it takes 106000 microseconds to retrieve a 1986392-element list from the table, which is pretty fast, isn't it? Lists are 2 words per element; thus the data is 4 x 1.7 MB.
EDIT 2: I started a thread on erlang-question (http://groups.google.com/group/erlang-programming/browse_thread/thread/5581a8b5b27d4fe1) and it turns out that 0.1 second is pretty much the time it takes to do memcpy() (move the data to the process's heap). On the other hand Odobenus Rosmarus's answer explains why retrieving binary takes 0 time.
Binaries themselves (those longer than 64 bytes) are stored in a special binary heap, outside of the process heap.
So retrieving a binary from the ETS table copies only the 'ProcBin' part of the binary to the process heap (roughly, a pointer to the start of the binary data in the shared binary memory, plus its size).
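A minimal sketch to reproduce the difference (module and table names are made up): store the same payload once as a refc binary and once as a list, then time the lookups.
-module(ets_copy_demo).
-export([run/0]).

%% Returns {MicrosecondsForBinary, MicrosecondsForList}.
run() ->
    Tab = ets:new(demo, [set]),
    Bin = binary:copy(<<0>>, 2000000),   %% ~2 MB refc binary, shared off-heap
    List = binary_to_list(Bin),          %% 2 words per element, on the heap
    true = ets:insert(Tab, {bin, Bin}),
    true = ets:insert(Tab, {list, List}),
    {TBin, _} = timer:tc(ets, lookup, [Tab, bin]),    %% copies only a ProcBin
    {TList, _} = timer:tc(ets, lookup, [Tab, list]),  %% copies every cons cell
    {TBin, TList}.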
