Maximum call depth exceeded - lua

I am using the integration engine Iguana by iNTERFACEWARE, which uses Lua as its scripting language.
I have a function that recursively calls itself to split a string based on the number of words, i.e. if a huge chunk of text is passed in, it is divided into chunks of, say, 80 words. This function works just fine up to a point, but beyond roughly 3000-4000 words it breaks with the following error:
call-stack-has-exceeded-maximum-of-depth-of-100
Here is the function that I am using:
Function to split text
What should be done to fix this? I am on Windows 8. Does it depend on the stack size of the machine (Windows 8 in my case) or on the depth of recursive calls allowed by the Iguana software?
Any fix?
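The original splitting function is not shown here, but the same chunking can be done with a plain loop instead of recursion, which keeps the call-stack depth constant no matter how long the text is. A minimal Lua sketch (the function and variable names are made up for illustration):

local function splitIntoChunks(text, wordsPerChunk)
   -- Collect words into chunks of at most wordsPerChunk, iteratively.
   local chunks, current = {}, {}
   for word in text:gmatch("%S+") do
      current[#current + 1] = word
      if #current == wordsPerChunk then
         chunks[#chunks + 1] = table.concat(current, " ")
         current = {}
      end
   end
   if #current > 0 then
      chunks[#chunks + 1] = table.concat(current, " ")
   end
   return chunks
end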

Related

forth implementation with JIT write protection?

I believe Apple has disabled the ability to write to and execute the same memory at the same time on the ARM64 architecture; see mmap() RWX page on MacOS (ARM64 architecture)?
This makes it difficult to port implementations like jonesforth, which keeps generated code and the code to generate it (like the built-in assembler in jonesforth.f) in the same segment.
I thought I could do something like map the user space from start to HERE as 'r-x', and from here to the end as 'rw-'. Then I'd have to constantly remap memory as I compile new words, and I couldn't go and fix up previous words (I believe SCODE would make use of it).
Do you have any advice on how to handle such limitations?
I guess I should look into other forth implementations that are running on M1 Macs.
A Forth implementation can have a problem with write-protected code segments only when it generates machine code that should be executable at once. There is no such problem if it uses threaded code. So it is assumed below that the Forth system has to generate machine code.
Data space and code space
Obviously, you have to separate code space from data space. Data space (at least its mutable regions, including regions for variables and data fields), as well as internal mutable memory regions and probably headers, should be mapped to 'rw-' segments. Code space should be mapped to 'r-x' segments.
The word here ( -- addr ) returns the address of the first cell available for reservation, which is writable for a program, and it should always be in an 'rw-' segment. You can have an internal word code::here ( -- addr ) that returns an address in code space, if you need one.
A decision for execution tokens is a compromise between speed and simplicity of implementation (an 'r-x' segment vs 'rw-'). The simplest case is that an execution token is represented by an address in an 'rw-' segment, and then execute performs an additional dereference to get the corresponding address of code.
Code generation
In the given conditions we should generate machine code into an 'rw-' segment, but before this code is executed, this segment should be made 'r-x'.
Probably the simplest solution is to allocate a memory block for every new definition, resize (shrink) the block on completion, and make it 'r-x'. Possible disadvantages are waste due to page-size granularity (e.g. 4 KiB) and perhaps memory fragmentation.
Changing the protection of the main code segment starting from code::here also incurs losses due to page-size granularity.
Another variant is to break the creation of a definition into two stages:
generate intermediate representation (IR) in a separate 'rw-' segment during compilation of a definition;
when the definition is completed, generate machine code in the main code segment from IR, and discard IR code.
Actually, it could be machine code in the first stage too, and then it is simply relocated to another place in the second stage.
Before writing to the main code segment you change it (or the relevant part of it) to 'rw-', and afterwards revert it to 'r-x'.
The subroutine that translates the IR code should reside in another 'r-x' segment that you don't change.
Forth is agnostic to the format of generated code, and in a straightforward system only a handful of definitions "know" what format is generated. So only these definitions should be changed to generate IR code. If you relocate machine code, you probably don't need to change even these definitions.

postscript stack overflow differing on mac and linux and affected by font setting

To explain my question I will first provide some code and explain what it does:
% 1 1 65532{}for % cut off on mac with font set
% 1 1 99996{}for % cut off on mac without font set
% 1 1 300048{}for % cut off on linux with font set
% 1 1 300368{}for % cut off on linux without font set
% /Times-Roman findfont 10 scalefont setfont
showpage
When I uncomment lines 1 and 5 (the setfont line) and run this PostScript program on my Mac I get a stack overflow; however, I don't get a stack overflow if I replace 65532 with any smaller integer. If I instead uncomment only line 2, then again I get a stack overflow, but not if I replace 99996 with any smaller integer. Lines 3 and 4 are similar, except on Linux.
From this I have concluded the following: the PostScript stack on the Mac has length 99998 (3 numbers before the for loop and the 99996 numbers created by the for loop give 99999 numbers, but this is one too many, so the stack should be 99998 long). Similarly, I have concluded that the Linux stack is 300370 long. Let me know if any of this reasoning is incorrect. Also, why are they different?
Next, I have concluded that setting the font takes up 34464 items on the stack on the Mac but only 320 items on Linux. Again, let me know if any of this seems incorrect.
First, it seems odd to me that setting the font should take up this much space at all; second, why do the two platforms differ so much (over 100 times more space on the Mac)?
Finally, I'd like to know whether there are any other operations in PostScript that use up a large amount of space on the stack, other than operations defined by the programmer. I'd consider a large amount to be more than 100 items on the stack.
Thanks
A good place to learn more about this area is the PLRM appendix on Implementation Limits. The typical size of the operand stack in Level 1 was around 1000. In Level 2 (and beyond) the stack will grow to accommodate new objects, but PostScript is not really suited to having so many objects on the stack.
Fonts are implemented as dictionary objects, so they do not take up space on the stack, but they do consume memory in PostScript Virtual Memory. On the stack itself a font only takes up space for one object.
There are a handful of operators which can put many objects on the stack. Any kind of loop (like for, which you've already discovered) can add objects to the stack; aload is another, since it spills the contents of an array onto the stack.
Practically, the stack size limitation shouldn't be a problem. If you have lots of data, then it really ought to be stored in data structures like arrays, strings, and dictionaries. Most situations where you may be tempted to put large amounts of data on the stack can be rewritten to be less stack-heavy.
A large part of the design of PostScript is to make it very lightweight in memory usage for the tasks it is good at. For example, the image operator in Level 1 takes a data-acquisition procedure rather than an array or string. The usual way to use this is to have the procedure read ahead in the source file to find its samples. A large image can be displayed without having to store all the samples in memory at once. The data-acquisition procedure can use a small string buffer to read samples and return them piece by piece.

Largest amount of entries in lua table

I am trying to build a Sieve of Eratosthenes in Lua and I have tried several things, but I find myself confronted with the following problem:
Lua's tables are too small for this scenario. If I just want to create a table with all the numbers (see the example below), the table is too "small" even with only 1/8 (...) of the numbers (the number is pretty big, I admit)...
max = 600851475143
numbers = {}
for i = 1, max do
    table.insert(numbers, i)
end
If I execute this script on my Windows machine, there is an error message saying: C:\Program Files (x86)\Lua\5.1\lua.exe: not enough memory. I tried it with Lua 5.3 running on my Linux machine too; there the process was simply killed. So it is pretty obvious that Lua can't handle that number of entries.
I don't really know whether it is just impossible to store that number of entries in a Lua table, or whether there is a simple solution for this (I tried using a long string as well...). And what exactly is the largest number of entries a Lua table can hold?
Update: Would it be possible to somehow manually allocate more memory for the table?
Update 2 (solution for the second question): The second question is an easy one; I just tested it by raising the count until the program broke: 33,554,432 (2^25) entries fit in one one-dimensional table on my 12 GB RAM system. Why 2^25? At 64 bits per number, 2^25 entries is 2,147,483,648 bits (256 MB) of raw number data; with Lua's per-entry table overhead on top of that, this is apparently where the roughly 2 GB limit of the 32-bit Lua for Windows build is reached.
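If you want to check how much memory such a table actually consumes on your build, collectgarbage("count") reports the size of the Lua heap in kilobytes. A small illustrative sketch:

-- Rough measurement of Lua heap usage for a 2^25-entry array.
local t = {}
for i = 1, 2^25 do
    t[i] = i
end
collectgarbage("collect")
print(("Lua heap: %.1f MB"):format(collectgarbage("count") / 1024))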
P.S. You may have noticed that this number is from Project Euler Problem 3. Yes, I am trying to solve that. Please don't give specific hints (..). Thank you :)
The Sieve of Eratosthenes only requires one bit per number, representing whether the number has been marked non-prime or not.
One way to reduce memory usage would be to use bitwise math to pack multiple flags into each table entry. Current Lua versions have support for bitwise operations (native operators in Lua 5.3+, or the bit32/LuaJIT bit libraries in older versions). Depending on the underlying implementation, you should be able to represent 32 or 64 bits (number flags) per table entry.
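A hedged sketch of that packing, assuming Lua 5.3's native 64-bit integers and bitwise operators (on 5.1/5.2 you would go through a bit library and pack 32 flags per entry instead):

-- Pack 64 one-bit flags into each table entry (Lua 5.3+).
local BITS = 64

local function newFlags(n)
    local t = {}
    for i = 1, math.ceil(n / BITS) do t[i] = 0 end
    return t
end

local function setFlag(t, i)                 -- mark number i
    local slot, bit = (i - 1) // BITS + 1, (i - 1) % BITS
    t[slot] = t[slot] | (1 << bit)
end

local function getFlag(t, i)                 -- is number i marked?
    local slot, bit = (i - 1) // BITS + 1, (i - 1) % BITS
    return (t[slot] >> bit) & 1 == 1
end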
Another option would be to use one or more very long strings instead of a table. You only need a linear array, which is really what a string is. Just have a long string with "t" or "f", or "0" or "1", at every position.
Caveat: string manipulation in Lua always involves duplication, which rapidly turns into O(n²) or worse behaviour in terms of performance. You wouldn't want one continuous string for the whole massive sequence, but you could break it up into blocks of a thousand characters, or of some power of 2, as sketched below. That would reduce your memory usage to about 1 byte per number while keeping the copying overhead manageable.
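A sketch of the block idea (the block size and helper names are made up): keep many short strings and rebuild only the block that changes, so each update copies at most one block rather than the whole sequence.

-- One byte per number, split into fixed-size string blocks.
local BLOCK = 4096
local blocks = {}

local function init(n)
    for b = 1, math.ceil(n / BLOCK) do
        blocks[b] = string.rep("1", BLOCK)   -- "1" means still assumed prime
    end
end

local function markComposite(i)
    local b   = math.floor((i - 1) / BLOCK) + 1
    local pos = (i - 1) % BLOCK + 1
    local s   = blocks[b]
    blocks[b] = s:sub(1, pos - 1) .. "0" .. s:sub(pos + 1)
end

local function isMarked(i)
    local b   = math.floor((i - 1) / BLOCK) + 1
    local pos = (i - 1) % BLOCK + 1
    return blocks[b]:sub(pos, pos) == "0"
end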
Edit: After noticing a point made elsewhere, I realized your maximum number is so large that, even with a bit per number, your memory requirements would optimally be about 73 gigabytes, which is extremely impractical. I would recommend following the advice Piglet gave in their answer, to look at Jon Sorenson's version of the sieve, which works on segments of the space instead of the whole thing.
I'll leave my suggestion, as it still might be useful for Sorenson's sieve, but yeah, you have a bigger problem than you realize.
Lua uses double-precision floats to represent numbers. That's 64 bits per number.
600851475143 numbers therefore amount to almost 4.5 terabytes of memory.
So it's not the fault of Lua or its tables. The error message even says
not enough memory
You just don't have enough RAM to allocate that much.
If you read the linked Wikipedia article carefully, you will find the following section:
As Sorenson notes, the problem with the sieve of Eratosthenes is not the number of operations it performs but rather its memory requirements.[8] For large n, the range of primes may not fit in memory; worse, even for moderate n, its cache use is highly suboptimal. The algorithm walks through the entire array A, exhibiting almost no locality of reference.

A solution to these problems is offered by segmented sieves, where only portions of the range are sieved at a time.[9] These have been known since the 1970s, and work as follows

...
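For illustration, here is a minimal, untuned sketch of that segmented approach in Lua (the function names are mine; a real implementation would also pack the per-segment flags, as discussed in the other answer):

-- Segmented Sieve of Eratosthenes: only one segment of flags is in memory at a time.
local function simpleSieve(limit)
    -- Ordinary sieve, used only up to sqrt(n).
    local composite, primes = {}, {}
    for i = 2, limit do
        if not composite[i] then
            primes[#primes + 1] = i
            for j = i * i, limit, i do composite[j] = true end
        end
    end
    return primes
end

local function segmentedSieve(n, segmentSize, visit)
    local base = simpleSieve(math.floor(math.sqrt(n)))
    local lo = 2
    while lo <= n do
        local hi = math.min(lo + segmentSize - 1, n)
        local composite = {}                       -- flags for [lo, hi] only
        for _, p in ipairs(base) do
            local start = math.max(p * p, math.ceil(lo / p) * p)
            for j = start, hi, p do
                composite[j - lo + 1] = true
            end
        end
        for i = lo, hi do
            if not composite[i - lo + 1] then visit(i) end   -- i is prime
        end
        lo = hi + 1
    end
end

-- Example: print the primes below 100 using segments of 30 numbers.
segmentedSieve(100, 30, print)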

How to vectorize/group together many signals generated from Qsys to Altera Quartus

In Altera Qsys, I am using ten parallel input ports (let's name them pio1 to pio10); each port is 12 bits wide. These parallel ports obtain their values from the VHDL block in the Quartus schematic. In the schematic BDF I can see pio1 to pio10 on the Nios II system symbol, so I can connect these PIOs to other blocks in my BDF.
My question is: how do I vectorize pio1 to pio10? Instead of seeing all ten PIOs coming out of the Nios system symbol one line at a time, what should I do to group them so that I only see one instead of ten? That single PIO I could name pio[1..10][1..12], where the first bracket means pio1 to pio10 and the second bracket means bit 1 to bit 12, because each parallel port has 12 bits.
Could you please let me know how I could do that?

Comparing two text files of almost 3 GB using MapReduce (Cloudera Hadoop 0.20.2)

I'm trying to do the following in Hadoop MapReduce (written in Java, on a Linux kernel OS):
Text files 'rules-1' and 'rules-2' (3 GB in total) contain rules; each rule is separated by a newline character, so the files can be read using the readLine() function.
The files 'rules-1' and 'rules-2' need to be imported as a whole from HDFS in every map function in my cluster, i.e. these files are not splittable across different map functions.
The input to the mapper's map function is a text file called 'record' (each line is terminated by a newline character), so from the 'record' file we get the (key, value) pairs. This file is splittable and can be given as input to the different map functions used in the whole MapReduce job.
What needs to be done is to compare each value (i.e. each line from the record file) with the rules inside 'rules-1' and 'rules-2'.
The problem is: if I pull each line of the rules-1 and rules-2 files into a static ArrayList only once, so that each mapper can share the same ArrayList and compare its elements with each input value from the record file, I get an out-of-memory error, since 3 GB cannot be stored in the ArrayList at one time.
Alternatively, if I import only a few lines from the rules-1 and rules-2 files at a time and compare them to each value, the MapReduce job takes a very long time to finish.
Could you give me any other ideas for how this can be done without the memory error? Would it help if I put those rules files inside an HDFS-backed database or something? I'm running out of ideas, actually, and would really appreciate your valuable suggestions.
If your input files are small, you can load them into static variables and use the rules as the input.
If that is not the case, I can suggest the following approaches:
a) Give rules-1 and rules-2 a high replication factor, close to the number of nodes you have. Then you can read rules-1 and rules-2 from HDFS for each record in the input relatively efficiently, because it will be a sequential read from the local datanode.
b) If you can devise a hash function which, when applied to a rule and to an input string, predicts without false negatives whether they can match, then you can emit this hash for both rules and input records and resolve all possible matches in the reducer. It would be very similar to the way a join is done with MapReduce.
c) I would also consider other optimization techniques, like building search trees or sorting, since otherwise the problem looks computationally expensive and will take forever...
On this page, find "Real-World Cluster Configurations"; it covers file size configuration.
You could use the parameter "mapred.child.java.opts" in conf/mapred-site.xml to increase the memory available to your mappers. You might not be able to run as many map slots per server, but with more servers in your cluster you could still parallelize your job.
Read the content text file as the MapReduce input, and read the keyword text file from within the mapper function (reading it from HDFS). Split the value coming from the MapReduce input using StringTokenizer on value.toString(). In your mapper function, write the code that reads the HDFS text file; it will read line by line, so use two while loops there for the comparison. Whenever you want to keep data, send it to the reducer.
Alternatively, split the 3 GB text file into several text files and apply your previous MapReduce program to all of those files as usual.
For splitting the text file I wrote a Java program; you decide how many lines you want written to each text file.
