Memory - Paging and TLB

I have a question about the following task.
Consider an IA-32 system where the MMU supports a two level page table. The second
level contains 1024 page table entries mapping to 4 KB page frames. Each page table
entry (both levels) has a size of 4 bytes. The system only supports 4 KB page size.
We want to read 8 MB of consecutive virtual memory sequentially, starting at byte 0. We read one word (4 bytes) at a time.
We have an 8 entry data TLB. How many memory accesses are needed to
read the 8 MB of memory specified above?
Does it make a difference, if the TLB has 4 entries instead of 8?
So, we read sequentially. This means 8 MB / 4 B = 2M memory accesses. We have a two-level page table. Therefore, 2M + 2*2M = 6M memory accesses without a TLB.
But I don't know how to calculate the memory accesses including a TLB.
Could anyone explain that to me? That would be very helpful.

Since the access pattern is a streaming access, each TLB entry will be used for one access to each four-byte word of its page and then never used again. This means that each loaded translation is reused 1023 times, so 1023 look-ups (2046 memory accesses) are avoided per page. (Since there is no overlap in the use of different translations, only perfectly localized reuse, a single-entry data TLB would have performance equivalent to even a 2048-entry TLB.)
Consider the following description of what is happening for a two-entry direct-mapped data TLB (recognizing that the least significant 12 bits of the virtual address—the offset within the page—are ignored for the TLB and one bit of the virtual address is used to index into the TLB):
load 0x0100_0000; // TLB entry 0 tag != 0x0800 (page # 0x0_1000) [miss]
// 2 memory accesses to fill TLB entry 0
load 0x0100_0004; // TLB entry 0 tag == 0x0800 [hit]
load 0x0100_0008; // TLB entry 0 tag == 0x0800 [hit]
... // 1020 TLB hits in TLB entry 0
load 0x0100_0ffc; // TLB entry 0 tag == 0x0800 [hit]; last word in page
load 0x0100_1000; // TLB entry 1 tag != 0x0800 (page # 0x0_1001) [miss]
// 2 memory accesses to fill TLB entry 1
load 0x0100_1004; // TLB entry 1 tag == 0x0800 [hit]
load 0x0100_1008; // TLB entry 1 tag == 0x0800 [hit]
... // 1020 TLB hits in TLB entry 1
load 0x0100_1ffc; // TLB entry 1 tag == 0x0800 [hit]; last word in page
load 0x0100_2000; // TLB entry 0 tag (0x0800) != 0x0801 (page # 0x0_1002) [miss]
// 2 memory accesses to fill TLB entry 0
load 0x0100_2004; // TLB entry 0 tag == 0x0801 [hit]
load 0x0100_2008; // TLB entry 0 tag == 0x0801 [hit]
... // 1020 TLB hits in TLB entry 0
load 0x0100_2ffc; // TLB entry 0 tag == 0x0801 [hit]; last word in page
load 0x0100_3000; // TLB entry 1 tag (0x0800) != 0x0801 (page # 0x0_1003) [miss]
// 2 memory accesses to fill TLB entry 1
load 0x0100_3004; // TLB entry 1 tag == 0x0801 [hit]
load 0x0100_3008; // TLB entry 1 tag == 0x0801 [hit]
... // 1020 TLB hits in TLB entry 1
load 0x0100_3ffc; // TLB entry 1 tag == 0x0801 [hit]; last word in page
... // repeat the above 510 times
// then the last 4 pages of the 8 MiB stream
load 0x017f_c000; // TLB entry 0 tag (0x0bfd) != 0x0bfe (page # 0x0_17fc) [miss]
// 2 memory accesses to fill TLB entry 0
load 0x017f_c004; // TLB entry 0 tag == 0x0bfe [hit]
load 0x017f_c008; // TLB entry 0 tag == 0x0bfe [hit]
... // 1020 TLB hits in TLB entry 0
load 0x017f_cffc; // TLB entry 0 tag == 0x0bfe [hit]; last word in page
load 0x017f_d000; // TLB entry 1 tag (0x0bfd) != 0x0bfe (page # 0x0_17fd) [miss]
// 2 memory accesses to fill TLB entry 1
load 0x017f_d004; // TLB entry 1 tag == 0x0bfe [hit]
load 0x017f_d008; // TLB entry 1 tag == 0x0bfe [hit]
... // 1020 TLB hits in TLB entry 1
load 0x017f_dffc; // TLB entry 1 tag == 0x0bfe [hit]; last word in page
load 0x017f_e000; // TLB entry 0 tag (0x0bfe) != 0x0bff (page # 0x0_17fe) [miss]
// 2 memory accesses to fill TLB entry 0
load 0x017f_e004; // TLB entry 0 tag == 0x0bff [hit]
load 0x017f_e008; // TLB entry 0 tag == 0x0bff [hit]
... // 1020 TLB hits in TLB entry 0
load 0x017f_effc; // TLB entry 0 tag == 0x0bff [hit]; last word in page
load 0x017f_f000; // TLB entry 1 tag (0x0bfe) != 0x0bff (page # 0x0_17ff) [miss]
// 2 memory accesses to fill TLB entry 1
load 0x017f_f004; // TLB entry 1 tag == 0x0bff [hit]
load 0x017f_f008; // TLB entry 1 tag == 0x0bff [hit]
... // 1020 TLB hits in TLB entry 1
load 0x017f_fffc; // TLB entry 1 tag == 0x0bff [hit]; last word in page
Each page is referenced 1024 times (once for each four-byte element) in sequence and then is never referenced again.
(Now consider a design with four TLB entries and two entries caching page directory entries [each of which has the pointer to the page of page table entries]. Each cached PDE will be reused for 1023 page look-ups, reducing them to one memory access each. [If the 8 MiB streaming access was repeated as an inner loop and was 4 MiB aligned, a two-entry PDE cache would be fully warmed up after the first iteration and all subsequent page table look-ups would only require one memory reference.])
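To put actual numbers on the original question, here is a back-of-the-envelope sketch in Python. It only restates the counts above, and assumes every page walk costs exactly two memory reads (one PDE, one PTE) with no page faults; the 4 MiB alignment for the PDE-cache variant is the assumption from the previous paragraph:

# Counting memory accesses for the 8 MiB streaming read.
WORD = 4                       # bytes per load
PAGE = 4 * 1024                # 4 KiB pages
TOTAL = 8 * 1024 * 1024        # 8 MiB to read

loads = TOTAL // WORD          # 2,097,152 data accesses
pages = TOTAL // PAGE          # 2,048 pages -> 2,048 TLB misses
WALK = 2                       # two-level walk: one PDE read + one PTE read

without_tlb = loads * (1 + WALK)     # 6,291,456
with_tlb = loads + pages * WALK      # 2,101,248

# Variant with a PDE cache (previous paragraph): each miss then costs
# one PTE read, plus one PDE read per 4 MiB region (2 regions, assuming
# the stream is 4 MiB aligned).
with_pde_cache = loads + pages + TOTAL // (4 * 1024 * 1024)  # 2,099,202

print(without_tlb, with_tlb, with_pde_cache)

Since each translation is fully used before the next page is touched, the with-TLB count is the same for a 4-entry and an 8-entry TLB (or even a single-entry TLB), so the TLB size makes no difference for this access pattern.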


Size in MB of mnesia table

How do you read :mnesia.info?
For example, I only have one table, some_table, and :mnesia.info returns this:
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
some_table: with 16020 records occupying 433455 words of mem
schema : with 2 records occupying 536 words of mem
===> System info in version "4.15.5", debug level = none <===
opt_disc. Directory "/home/ubuntu/project/Mnesia.nonode@nohost" is NOT used.
use fallback at restart = false
running db nodes = [nonode@nohost]
stopped db nodes = []
master node tables = []
remote = []
ram_copies = ['some_table',schema]
disc_copies = []
disc_only_copies = []
[{nonode@nohost,ram_copies}] = [schema,'some_table']
488017 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
Also calling:
:mnesia.table_info("some_table", :size)
It returns me 16020 which I think is the number of keys, but how can I get the memory usage?
First, you need mnesia:table_info(Table, memory) to obtain the number of words occupied by your table; in your example you are getting the number of items in the table, not the memory. To convert that value to MB, first use erlang:system_info(wordsize) to get the word size in bytes for your machine architecture (on a 32-bit system a word is 4 bytes, and on 64-bit it's 8 bytes), multiply it by your Mnesia table memory to obtain the size in bytes, and finally convert the value to megabytes:
MnesiaMemoryMB = (mnesia:table_info("some_table", memory) * erlang:system_info(wordsize)) / (1024*1024).
You can use erlang:system_info(wordsize) to get the word size in bytes; on a 32-bit system a word is 32 bits or 4 bytes, and on 64-bit it's 8 bytes. So your table is using 433455 x wordsize bytes.
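Plugging in the numbers from the question, as a quick illustration in Python (the word size depends on whether the Erlang VM is 32-bit or 64-bit):

# 433455 words, converted to MB for both common word sizes.
words = 433455
for wordsize in (4, 8):                  # 32-bit vs 64-bit VM
    mb = words * wordsize / (1024 * 1024)
    print(wordsize, "byte words:", round(mb, 2), "MB")
# 4 byte words: 1.65 MB
# 8 byte words: 3.31 MB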

Mnesia: always suffix fragmented table fragments?

When I create a fragmented table in Mnesia, all of the table fragments have the suffix _fragN except for the first fragment. This is error-prone: any code that accesses the table without specifying the correct access module will appear to work, since it reads from and writes to the first fragment, but it will not mix with code using the correct access module, since the two will be looking for elements in different places.
Is there a way to tell Mnesia to use a fragment suffix for all table fragments? That would avoid that problem, by making incorrect accesses fail noisily.
For example, if I create a table with four fragments:
1> mnesia:start().
ok
2> mnesia:create_table(foo, [{frag_properties, [{node_pool, [node()]}, {n_fragments, 4}]}]).
{atomic,ok}
then mnesia:info/0 will list the fragments as foo, foo_frag2, foo_frag3 and foo_frag4:
3> mnesia:info().
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
foo : with 0 records occupying 304 words of mem
foo_frag2 : with 0 records occupying 304 words of mem
foo_frag3 : with 0 records occupying 304 words of mem
foo_frag4 : with 0 records occupying 304 words of mem
schema : with 5 records occupying 950 words of mem
===> System info in version "4.14", debug level = none <===
opt_disc. Directory "/Users/legoscia/Mnesia.nonode@nohost" is NOT used.
use fallback at restart = false
running db nodes = [nonode@nohost]
stopped db nodes = []
master node tables = []
remote = []
ram_copies = [foo,foo_frag2,foo_frag3,foo_frag4,schema]
disc_copies = []
disc_only_copies = []
[{nonode@nohost,ram_copies}] = [schema,foo_frag4,foo_frag3,foo_frag2,foo]
3 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
I'd want foo to be foo_frag1 instead. Is that possible?

Forcing a table rehash not working after a previous rehash

I've created a function that resizes an array and sets new entries to 0; it can also decrease the size of the array in 2 different ways:
1. Simply setting the n property to the new size (the length operator cannot be used because of this reason).
2. Setting all values after the new size to nil, up to 2*size, to force a rehash.
local function resize(array, elements, free)
  local size = array.n
  if elements < size then -- Decrease Size
    array.n = elements
    if free then
      size = math.max(size, #array) -- In case of multiple resizes
      local base = elements + 1
      for idx = base, 2*size do -- Force a rehash -> free extra unneeded memory
        array[idx] = nil
      end
    end
  elseif elements > size then -- Increase Size
    array.n = elements
    for idx = size + 1, elements do
      array[idx] = 0
    end
  end
end
How I tested it:
local mem = {n=0};
resize(mem, 50000)
print(mem.n, #mem) -- 50000 50000
print(collectgarbage("count")) -- relatively large number
resize(mem, 10000, true)
print(mem.n, #mem) -- 10000 10000
print(collectgarbage("count")) -- smaller number
resize(mem, 20, true)
print(mem.n, #mem) -- 20 20
print(collectgarbage("count")) -- same number as above, but it should be a smaller number
However, when I don't pass true as the third argument to the second call of resize (so it doesn't force a rehash on the second call), the third call does end up rehashing it.
Am I missing something? I'm expecting the third one to also rehash after the second one has.
Here is a clearer picture of what the table usually looks like before and after the resizes:
table: 0x15bd3d0 n: 0 #: 0 narr: 0 nrec: 1
table: 0x15bd3d0 n: 50000 #: 50000 narr: 65536 nrec: 1
table: 0x15bd3d0 n: 10000 #: 10000 narr: 16384 nrec: 2
table: 0x15bd3d0 n: 20 #: 20 narr: 16384 nrec: 2
And here is what happens:
During the resize to 50000 elements, the table is rehashed several times, and at the end it contains exactly one hash part slot for the n field and enough array part slots for the integer keys.
During the shrinking to 10000 elements, you first assign nil to the integer keys 10001 to 65536, and then from 65537 to 100000. The first group of assignments will never cause a rehash, because you assign to existing fields. This has to do with the guarantees for the next function. The second group of assignments will cause rehashes, but since you are assigning nils, Lua will realize at some point that the array part of the table is more than half empty (see comment at the beginning of ltable.c). Lua will then shrink the array part to a reasonable size and use a second hash slot for the new key. But since you are assigning nils, that second hash slot is never occupied, and Lua is free to re-use it for all the remaining assignments (and it often but not always does). You wouldn't notice a rehash at this point anyway, because you will always end up with the 16384 array slots and 2 hash slots (one for n, one for the new element to be assigned).
The shrinking to 20 elements just continues this way, with the exception that a second hash slot is already available. So you might never get a rehash (and the array size stays larger than necessary), but if you do (Lua for some reason doesn't like the one free hash slot), you'll see the number of array slots drop to a reasonable level.
This is what it looks like when you do get a rehash during the second shrinking:
table: 0x11c43d0 n: 0 #: 0 narr: 0 nrec: 1
table: 0x11c43d0 n: 50000 #: 50000 narr: 65536 nrec: 1
table: 0x11c43d0 n: 10000 #: 10000 narr: 16384 nrec: 2
table: 0x11c43d0 n: 20 #: 20 narr: 32 nrec: 2
If you want to repeat my experiments, the git HEAD version of lua-getsize (original version here) now also returns the number of slots in the array/hash parts of a table.

What is the relation between address lines and memory?

This is my assignment:
Write a program to find the number of address lines in n Kbytes of memory. Assume that n is always a power of 2.
Sample input: 2
Sample output: 11
I don't need specific coding help, but I don't know the relation between address lines and memory.
To express it in very simple terms: without any bus multiplexing, the number of bits required to address a memory is the number of lines (address or data) required to access that memory.
Quoting from the Wikipedia article,
a system with a 32-bit address bus can address 2^32 (4,294,967,296) memory locations.
For a simple example, consider this: you have 3 address lines (A, B, C), so the values which can be formed using 3 bits are:
A B C
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
Total 8 values. So using ABC, you can access any of those eight values, i.e., you can reach any of those memory addresses.
So, TL;DR, the simple relationship is: with n lines, we can represent 2^n addresses.
An address line usually refers to a physical connection between the CPU/chipset and memory; the address lines specify which address in memory to access. So the task is to find out how many bits are required to address n kilobytes of memory.
In your example, the input is 2 kilobytes = 2048 bytes = 2^11, hence the answer 11. If the input is 64 kilobytes, the answer is 16 (65536 = 2^16).
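A minimal sketch of the assignment in Python (the function name and structure are my own; it relies on n being a power of 2, as the task guarantees):

# n KB of memory means n * 1024 byte locations; addressing 2^k
# locations requires k address lines.
def address_lines(n_kb):
    size = n_kb * 1024            # total number of addressable bytes
    return size.bit_length() - 1  # exact log2 when size is a power of 2

print(address_lines(2))    # 11
print(address_lines(64))   # 16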

SE 4.10 bcheck <filename>, SE 2.10 bcheck <filename.ext> and other bcheck anomalies

ISQL-SE 4.10.DD6 (DOS 6.22):
BCHECK C-ISAM B-tree Checker version 4.10.DD6
C-ISAM File: c:\dbfiles.dbs\*.*
ERROR: cannot open C-ISAM file
In SE 2.10 it worked with the wildcard *.* for all files, but in SE 4.10 it doesn't. I have an SQL script which my users periodically run to reorg the customer and transactions tables. I also have a FIX.BAT DOS script [bcheck -y *.*] as a utility option for my users in case any tables get screwed up. Since users running the reorg will now increment the table version number (example: CUSTO102, 110, …), I'm going to have to devise a way to strip the .DAT extensions from the .DBS dir and feed them to BCHECK. Before, my reorg would always re-create a static CUSTOMER.DAT with CREATE TABLE customer IN "C:\DBFILES.DBS\CUSTOMER"; but that created the write permission problem and I had to revert back to SE's default datafile journaling…
Before running BCHECK on CUSTO102, its .IDX file size was 22,089 bytes and its .DAT size was 882,832 bytes.
After running BCHECK on CUSTO102, its .IDX size increased to 122,561 bytes, and a new .IDY file was created with 88,430 bytes.
What's a .IDY file?
C:\DBFILES.DBS> bcheck -y CUSTO102
BCHECK C-ISAM B-tree Checker version 4.10.DD6
C-ISAM File: c:\dbfiles.dbs\CUSTO102
Checking dictionary and file sizes.
Index file node size = 512
Current C-ISAM index file node size = 512
Checking data file records.
Checking indexes and key descriptions.
Index 1 = unique key
0 index node(s) used -- 1 index b-tree level(s) used
Index 2 = duplicates (2,30,0)
42 index node(s) used -- 3 index b-tree level(s) used
Index 3 = unique key (32,5,0)
29 index node(s) used -- 2 index b-tree level(s) used
Index 4 = duplicates (242,4,2)
37 index node(s) used -- 2 index b-tree level(s) used
Index 5 = duplicates (241,1,0)
36 index node(s) used -- 2 index b-tree level(s) used
Index 6 = duplicates (46,4,2)
38 index node(s) used -- 2 index b-tree level(s) used
Checking data record and index node free lists.
ERROR: 177 missing index node pointer(s)
Fix index node free list ? yes
Recreating index node free list.
Recreating index 6.
Recreating index 5.
Recreating index 4.
Recreating index 3.
Recreating index 2.
Recreating index 1.
184 index node(s) used, 177 free -- 1083 data record(s) used, 0 free
The problem with the wildcards is more likely an issue with the command interpreter that was used to run bcheck than with bcheck itself. If you give bcheck a list of file names (such as 'abc def.dat def.idx'), then it will process the C-ISAM file pairs (abc.dat, abc.idx), (def.dat, def.idx) and (def.dat, def.idx - again). Since it complained about being unable to open 'c:\dbfiles.dbs\*.*', it means that the command interpreter did not expand the '*.*' bit, or there was nothing for it to expand into.
I expect that the '.IDY' file is an intermediate used while rebuilding the indexes for the table. I do not know why it was not cleaned up - maybe the process did not complete.
About sizes, I think your table has about 55,000 rows of size 368 bytes each (SE might say 367; the difference is the record status byte at the end, which is either '\0' for deleted or '\n' for current). The unique index on the CHAR(5) column (index 3) requires 9 bytes per entry, or about 56 keys per index node, for about 1000 index nodes. The duplicate indexes are harder to size; you need space for the key value plus a list of 4-byte numbers for the duplicates, all packed into 512-byte pages. The 22 KB index file was missing a lot of information. The revised index file is about the right size. Note that index 1 is the 'ROWID' index; it does not occupy any space. (Index 1 is also why, although every table created by SE is stored in a C-ISAM file, not all C-ISAM files are necessarily compatible with SE.)
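For what it's worth, here is the index-sizing arithmetic from that paragraph as a small Python sketch (the 512-byte node size comes from the bcheck output above; the 9-byte entry size, a 5-byte key plus a 4-byte row pointer, is the estimate used in the answer):

# Approximate C-ISAM index sizing for the unique CHAR(5) index.
NODE_SIZE = 512                  # index node size reported by bcheck
ENTRY_SIZE = 5 + 4               # CHAR(5) key + 4-byte row pointer
keys_per_node = NODE_SIZE // ENTRY_SIZE     # about 56 keys per node

rows = 55000                     # row count estimated in the answer
nodes = -(-rows // keys_per_node)           # ceiling division
print(keys_per_node, nodes)      # 56, 983 -> about 1000 index nodes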
