insufficient memory when using proc assoc in SAS - memory

I'm trying to run the following and I receive an error saying that ERROR: The SAS System stopped processing this step because of insufficient memory.
The dataset has about 1170(row)*90(column) records. What are my alternatives here?
The error infor. is below:
332 proc assoc data=want1 dmdbcat=dbcat pctsup=0.5 out=frequentItems;
333 id tid;
334 target item_new;
335 run;
----- Potential 1 item sets = 188 -----
Counting items, records read: 19082
Number of customers: 203
Support level for item sets: 1
Maximum count for a set: 136
Sets meeting support level: 188
Megs of memory used: 0.51
----- Potential 2 item sets = 17578 -----
Counting items, records read: 19082
Maximum count for a set: 119
Sets meeting support level: 17484
Megs of memory used: 1.54
----- Potential 3 item sets = 1072352 -----
Counting items, records read: 19082
Maximum count for a set: 111
Sets meeting support level: 1072016
Megs of memory used: 70.14
Error: Out of memory. Memory used=2111.5 meg.
Item Set 4 is null.
ERROR: The SAS System stopped processing this step because of insufficient memory.
WARNING: The data set WORK.FREQUENTITEMS may be incomplete. When this step was stopped there were
1089689 observations and 8 variables.

From the documentation (http://support.sas.com/documentation/onlinedoc/miner/em43/assoc.pdf):
Caution: The theoretical potential number of item sets can grow very
quickly. For example, with 50 different items, you have 1225 potential
2-item sets and 19,600 3-item sets. With 5,000 items, you have over 12
million of the 2-item sets, and a correspondingly large number of
3-item sets.
Processing an extremely large number of sets could cause your system
to run out of disk and/or memory resources. However, by using a higher
support level, you can reduce the item sets to a more manageable
number.
So - provide a support= option make sure it's sufficiently high, e.g.:
proc assoc data=want1 dmdbcat=dbcat pctsup=0.5 out=frequentItems support=20;
id tid;
target item_new;
run;

Is there a way to frame the data mining task so that it requires less memory for storage or operations? In other words, do you need all 90 columns or can you eliminate some? Is there some clear division within the data set such that PROC ASSOC wouldn't be expected to use those rows for its findings?
You may very well be up against software memory allocation limits here.

Related

Understanding AArch64 Translation Tables

I'm doing a hobby OS project and I an trying to get Virtual Memory set up. I had another project in an x86 architecture working with Page Tables but I am now learning ArmV8 now.
Now, I now that the maximum amount of bits used for addressing is 48[1]. The last 12 to 16 bits are used "as-is" to index within the selected region (depending on which granule size is selected[2]).
I just don't understand how we get those intermediate bits. Obviously the documentation is showing that intermediate tables are used[3] but it is quite unclear on how those tables are used.
In the first half of the following image, we see translation of an address with 4k granules and using 38 address bits.
I can't understand this image in the slightest. The "offsets", for example bits 38 to 30 point to an entry in the L1 table. How and where is this table defined ?
What I think is happening is, this a 12+8+8+8 address translation scheme. Starting from the right, 12 bits to find an offset within a 4096 block of memory. Right of that is 8 bits for L3, meaning that L3 indexes 256 blocks of 4096 bytes (1MB). Right of this, L2, has 8 bits also so 256 entries of (256*4096), totalling 256MB per L2 entry. Right of L2 is L1 with also 8 bits, 256 entries of 256MB means the total addressable memory is 64GB of physical RAM.
I don't think this is correct because that would only allow a 1:1 mapping of memory. Each table descriptor needs to carry some access flags and what not. Thus going back to the question of: how are those table defined. Each offset section is 8 bits and that's not enough to contain the address of a translation table.
Anyway, I am completely lost. I would appreciate if someone could give me a "plain english" explanation of how a translation table walk is done ? A graph would be nice but probably too much effort, I'll make one and share if after to help me synthesize the information. Or at least, if someone has one, a link to a good video/guide where the information isn't totally obfuscated ?
Here is the list of materials I have consulted:
https://developer.arm.com/documentation/den0024/a/The-Memory-Management-Unit/Translating-a-Virtual-Address-to-a-Physical-Address
https://forums.raspberrypi.com/viewtopic.php?t=227139
https://armv8-ref.codingbelief.com/en/chapter_d4/d42_4_translation_tables_and_the_translation_proces.html
https://github.com/bztsrc/raspi3-tutorial/blob/master/10_virtualmemory/mmu.c
[1]https://developer.arm.com/documentation/den0024/a/The-Memory-Management-Unit/Translation-tables-in-ARMv8-A
[2]https://developer.arm.com/documentation/den0024/a/The-Memory-Management-Unit/Translation-tables-in-ARMv8-A/Effect-of-granule-sizes-on-translation-tables
[3]https://developer.arm.com/documentation/den0024/a/The-Memory-Management-Unit/Translating-a-Virtual-Address-to-a-Physical-Address
The entire model behind translation tables arises from three values: the size of a translation table entry (TTE), the hardware page size (aka "translation granule"), and the amount of bits used for virtual addressing.
On arm64, TTEs are always 8 bytes. The hardware page size can be one of 4KiB, 16KiB or 64KiB (0x1000, 0x4000 or 0x10000 bytes), depending on both hardware support and runtime configuration. The amount of bits used for virtual addressing similarly depends on hardware support and runtime configuration, but with a lot more complex constraints.
By example
For the sake of simplicity, let's consider address translation under TTBR0_EL1 with no block mappings, no virtualization going on, no pointer authentication, no memory tagging, no "large physical address" support and the "top byte ignore" feature being inactive. And let's pick a hardware page size of 0x1000 bytes and 39-bit virtual addressing.
From here, I find it easiest to start at the result and go backwards in order to understand why we arrived here. So suppose you have a virtual address of 0x123456000 and the hardware maps that to physical address 0x800040000 for you. Because the page size is 0x1000 bytes, that means that for 0 <= n <= 0xfff, all accesses to virtual address 0x123456000+n will go to physical address 0x800040000+n. And because 0x1000 = 2^12, that means the lowest 12 bytes of your virtual address are not used for address translation, but indexing into the resulting page. Though the ARMv8 manual does not use this term, they are commonly called the "page offset".
63 12 11 0
+------------------------------------------------------------+-------------+
| upper bits | page offset |
+------------------------------------------------------------+-------------+
Now the obvious question is: how did we get 0x800040000? And the obvious answer is: we got it from a translation table. A "level 3" translation table, specifically. Let's defer how we found that for just a moment and suppose we know it's at 0x800037000. One thing of note is that translation tables adhere to the hardware page size as well, so we have 0x1000 bytes of translation information there. And because we know that one TTE is 8 bytes, that gives us 0x1000/8 = 0x200, or 512 entries in that table. 512 = 2^9, so we'll need 9 bits from our virtual address to index into this table. Since we already use the lower 12 bits as page offset, we take bits 20:12 here, which for our chosen address yield the value 0x56 ((0x123456000 >> 12) & 0x1ff). Multiply by the TTE size, add to the translation table address, and we know that the TTE that gave us 0x800040000 is written at address 0x8000372b0.
63 21 20 12 11 0
+------------------------------------------------------------+-------------+
| upper bits | L3 index | page offset |
+------------------------------------------------------------+-------------+
Now you repeat the same process over for how you got 0x800037000, which this time came from a TTE in a level 2 translation table. You again take 9 bits off your virtual address to index into that table, this time with an value of 0x11a ((0x123456000 >> 21) & 0x1ff).
63 30 29 21 20 12 11 0
+------------------------------------------------------------+-------------+
| upper bits | L2 index | L3 index | page offset |
+------------------------------------------------------------+-------------+
And once more for a level 1 translation table:
63 40 39 30 29 21 20 12 11 0
+------------------------------------------------------------+-------------+
| upper bits | L1 index | L2 index | L3 index | page offset |
+------------------------------------------------------------+-------------+
At this point, you used all 39 bits of your virtual address, so you're done. If you had 40-bit addressing, then there'd be another L0 table to go through. If you had 38-bit addressing, then we would've taken the L1 table all the same, but it would only span 0x800 bytes instead of 0x1000.
But where did the L1 translation table come from? Well, from TTBR0_EL1. Its physical address is just in there, serving as the root for address translation.
Now, to perform the actual translation, you have to do this whole process in reverse. You start with a translation table from TTBR0_EL1, but you don't know ad-hoc whether it's L0, L1, etc. To figure that out, you have to look at the translation granule and the number of bits used for virtual addressing. With 4KiB pages there's a 12-bit page offset and 9 bits for each level of translation tables, so with 39 bits you're looking at an L1 table. Then you take bits 39:30 of the virtual address to index into it, giving you the address of the L2 table. Rinse and repeat with bits 29:21 for L2 and 20:12 for L3, and you've arrived at the physical address of the target page.

Apache Ignite use too much RAM

I've tried to use Ignite to store events, but face a problem of too much RAM usage during inserting new data
I'm runing ignite node with 1GB Heap and default configuration
curs.execute("""CREATE TABLE trololo (id LONG PRIMARY KEY, user_id LONG, event_type INT, timestamp TIMESTAMP) WITH "template=replicated" """);
n = 10000
for i in range(200):
values = []
for j in range(n):
id_ = i * n + j
event_type = random.randint(1, 5)
user_id = random.randint(1000, 5000)
timestamp = datetime.datetime.utcnow() - timedelta(hours=random.randint(1, 100))
values.append("({id}, {user_id}, {event_type}, '{timestamp}')".format(
id=id_, user_id=user_id, event_type=event_type, uid=uid, timestamp=timestamp.strftime('%Y-%m-%dT%H:%M:%S-00:00')
))
query = "INSERT INTO trololo (id, user_id, event_type, TIMESTAMP) VALUES %s;" % ",".join(values)
curs.execute(query)
But after loading about 10^6 events, I got 100% CPU usage because all heap are taken and GC trying to clean some space (unsuccessfully)
Then I stop for about 10 minutes and after that GC succesfully clean some space and I could continue loading new data
Then again heap fully loaded and all over again
It's really strange behaviour and I couldn't find a way how I could load 10^7 events without those problems
aproximately event should take:
8 + 8 + 4 + 10(timestamp size?) is about 30 bytes
30 bytes x3 (overhead) so it should be less than 100bytes per record
So 10^7 * 10^2 = 10^9 bytes = 1Gb
So it seems that 10^7 events should fit into 1Gb RAM, isn't it?
Actually, since version 2.0, Ignite stores all in offheap with default settings.
The main problem here is that you generate a very big query string with 10000 inserts, that should be parsed and, of course, will be stored in heap. After decreasing this size for each query, you will get better results here.
But also, as you can see in doc for capacity planning, Ignite adds around 200 bytes overhead for each entry. Additionally, add around 200-300MB per node for internal memory and reasonable amount of memory for JVM and GC to operate efficiently
If you really want to use only 1gb heap you can try to tune GC, but I would recommend increasing heap size.

The Keil RTX RTOS thread stack size

In the Keil RTX RTOS configuration file, user could configure the default user thread stack size.
In general, the stack holds auto/local variables. The "ZI data" section holds uninitialized global variables.
So if I change the user thread stack size in the RTX configuration file, the stack size will increase and the "ZI data" section size will not increase.
I test it, the test result shows that if I increase user thread stack size. The "ZI data" section size will increase synchronously with the same size.
In my test program, there is 6 threads and each has 600 bytes stack. I use Keil to build the program and it shows me that:
Code (inc. data) RO Data RW Data ZI Data Debug
36810 4052 1226 380 6484 518461 Grand Totals
36810 4052 1226 132 6484 518461 ELF Image Totals (compressed)
36810 4052 1226 132 0 0 ROM Totals
==============================================================================
Total RO Size (Code + RO Data) 38036 ( 37.14kB)
Total RW Size (RW Data + ZI Data) 6864 ( 6.70kB)
Total ROM Size (Code + RO Data + RW Data) 38168 ( 37.27kB)
But if I changed each thread stack size to 800 bytes. Keil shows me as follows:
==============================================================================
Code (inc. data) RO Data RW Data ZI Data Debug
36810 4052 1226 380 7684 518461 Grand Totals
36810 4052 1226 132 7684 518461 ELF Image Totals (compressed)
36810 4052 1226 132 0 0 ROM Totals
==============================================================================
Total RO Size (Code + RO Data) 38036 ( 37.14kB)
Total RW Size (RW Data + ZI Data) 8064 ( 7.88kB)
Total ROM Size (Code + RO Data + RW Data) 38168 ( 37.27kB)
==============================================================================
The "ZI data" section size increase from 6484 to 7684 bytes. 7684 - 6484 = 1200 = 6 * 200. And 800 - 600 = 200.
So I see the thread stack is put in "ZI Data" section.
My question is:
Does it mean auto/local variables in the thread will be put in "ZI Data" section, when thread stack is put in "ZI data" section in RAM ?
If it's true, it means there is no stack section at all. There are only "RO/RW/ZI Data" and heap sections at all.
This article gives me the different answer. And I am a little confused about it now.
https://developer.mbed.org/handbook/RTOS-Memory-Model
The linker determines what memory sections exist. The linker creates some memory sections by default. In your case, three of those default sections are apparently named "RO Data", "RW Data", and "ZI Data". If you don't explicitly specify which section a variable should be located in then the linker will assign it to one of these default sections based on whether the variable is declared const, initialized, or uninitialized.
The linker is not automatically aware that you are using an RTOS. And it has no special knowledge of which variables you intend to use as thread stacks. So the linker does not automatically create independent memory sections for your thread stacks. Rather, the linker will treat the stack variables like any other variable and include them in one of the default memory sections. In your case the thread stacks are apparently being put into the ZI Data section by the linker.
If you want the linker to create special independent memory sections for your thread stacks then you have to explicitly tell the linker to do so via the linker command file. And then you also have to specify that the stack variables should be located in your custom sections. Consult the linker manual for details on how to do this.
Tasks stacks have to come from somewhere - in RTX by default they are allocated statically and are of fixed size.
The os_tsk_create_user() allows the caller to supply a stack that could be allocated in any manner (statically or from the heap; allocating from the caller stack is possible, but unusual, probably pointless and certainly dangerous) so long as it has 8 byte alignment. I find RTX's automatic stack allocation almost useless and seldom appropriate in all but the most trivial application.

erlang crash dump no more index entries

i have a problem about erlang.
One of my Erlang node crashes, and generates erl_crash.dump with reason no more index entries in atom_tab (max=1048576).
i checked the dump file and i found that there are a lot of atoms in the form of 'B\2209\000..., (about 1000000 entries)
=proc:<0.11744.7038>
State: Waiting
Name: 'B\2209\000d\022D.
Spawned as: proc_lib:init_p/5
Spawned by: <0.5032.0>
Started: Sun Feb 23 05:23:27 2014
Message queue length: 0
Number of heap fragments: 0
Heap fragment data: 0
Reductions: 1992
Stack+heap: 1597
OldHeap: 1597
Heap unused: 918
OldHeap unused: 376
Program counter: 0x0000000001eb7700 (gen_fsm:loop/7 + 140)
CP: 0x0000000000000000 (invalid)
arity = 0
do you have some experience about what they are?
Atoms
By default, the maximum number of atoms is 1048576. This limit can be raised or lowered using the +t option.
Note: an atom refers into an atom table which also consumes memory. The atom text is stored once for each unique atom in this table. The atom table is not garbage-collected.
I think that you produce a lot of atom in your program, the number of atom reach the number limition for atom.
You can use this +t option to change the number limition of atom in your erlang VM when your start your erlang node.
So it tells you, that you generate atoms. There is somewhere list_to_atom/1 which is called with variable argument. Because you have process with this sort of name, you register/2 processes with this name. It is may be your code or some third party module what you use. It's bad behavior. Don't do that and don't use modules which is doing it.
To be honest, I can imagine design where I would do it intentionally but it is very special case and it is obviously not the case when you ask this question.

SE 4.10 bcheck <filename>, SE 2.10 bcheck <filename.ext> and other bcheck anomalies

ISQL-SE 4.10.DD6 (DOS 6.22):
BCHECK C-ISAM B-tree Checker version 4.10.DD6
C-ISAM File: c:\dbfiles.dbs\*.*
ERROR: cannot open C-ISAM file
In SE2.10 it worked with wilcards * .* for all files, but in SE4.10 it doesn’t. I have an SQL script which my users periodically run to reorg customer and transactions tables. Then I have a FIX.BAT DOS script [bcheck –y * .*] as a utility option for my users in case any tables get screwed up. Since users running the reorg will now increment the table version number, example: CUSTO102, 110, … now I’m going to have to devise a way to strip the .DAT extensions from the .DBS dir and feed it to BCHECK. Before, my reorg would always re-create a static CUSTOMER.DAT with CREATE TABLE customer IN “C:\DBFILES.DBS\CUSTOMER”; but that created the write permission problem and had to revert back to SE’s default datafile journaling…
Before running BCHECK on CUSTO102, its .IDX file size was 22,089 bytes and its .DAT size is 882,832 bytes.
After running BCHECK on CUSTO102, its .IDX size increased to 122,561 bytes, however a new .IDY file was created with 88,430 bytes..
What's a .IDY file ???
C:\DBFILES.DBS> bcheck –y CUSTO102
BCHECK C-ISAM B-tree Checker version 4.10.DD6
C-ISAM File: c:\dbfiles.dbs\CUSTO102
Checking dictionary and file sizes.
Index file node size = 512
Current C-ISAM index file node size = 512
Checking data file records.
Checking indexes and key descriptions.
Index 1 = unique key
0 index node(s) used -- 1 index b-tree level(s) used
Index 2 = duplicates (2,30,0)
42 index node(s) used -- 3 index b-tree level(s) used
Index 3 = unique key (32,5,0)
29 index node(s) used -- 2 index b-tree level(s) used
Index 4 = duplicates (242,4,2)
37 index node(s) used -- 2 index b-tree level(s) used
Index 5 = duplicates (241,1,0)
36 index node(s) used -- 2 index b-tree level(s) used
Index 6 = duplicates (46,4,2)
38 index node(s) used -- 2 index b-tree level(s) used
Checking data record and index node free lists.
ERROR: 177 missing index node pointer(s)
Fix index node free list ? yes
Recreating index node free list.
Recreating index 6.
Recreating index 5.
Recreating index 4.
Recreating index 3.
Recreating index 2.
Recreating index 1.
184 index node(s) used, 177 free -- 1083 data record(s) used, 0 free
The problem with the wild cards is more likely an issue with the command interpreter that was used to run bcheck than with bcheck itself. If you give bcheck a list of file names (such as 'abc def.dat def.idx', then it will process the C-ISAM file pairs (abc.dat, abc.idx), (def.dat, def.idx) and (def.dat, def.idx - again). Since it complained about being unable to open 'c:\dbfiles.dbs\*.*', it means that the command interpreter did not expand the '*.*' bit, or there was nothing for it to expand into.
I expect that the '.IDY' file is an intermediate used while rebuilding the indexes for the table. I do not know why it was not cleaned up - maybe the process did not complete.
About sizes, I think your table has about 55,000 rows of size 368 bytes each (SE might say 367; the difference is the record status byte at the end, which is either '\0' for deleted or '\n' for current). The unique index on the CHAR(5) column (index 3) requires 9 bytes per entry, or about 56 keys per index node, for about 1000 index nodes. The duplicate indexes are harder to size; you need space for the key value plus a list of 4-byte numbers for the duplicates, all packed into 512-byte pages. The 22 KB index file was missing a lot of information. The revised index file is about the right size. Note that index 1 is the 'ROWID' index; it does not occupy any space. (Index 1 is also why although every table created by SE is stored in a C-ISAM file, not all C-ISAM files are necessarily compatible with SE.)

Resources