Garbage collection and the memoization of a string-to-string function

Garbage collection and the memoization of a string-to-string function - lua

The following exercise comes from p. 234 of Ierusalimschy's Programming in Lua (4th edition). (NB: Earlier in the book, the author explicitly rejects the word memoization, and insists on using memorization instead. Keep this in mind as you read the excerpt below.)
Exercise 23.3: Imagine you have to implement a memorizing table for a function from strings to strings. Making the table weak will not do the removal of entries, because weak tables do not consider strings as collectable objects. How can you implement memorization in that case?
I am stumped!
Part of my problem is that I have not been able to devise a way to bring about the (garbage) collection of a string.
In contrast, with a table, I can equip it with finalizer that will report when the table is about to be collected. Is there a way to confirm that a given string (and only that string) has been garbage-collected?
Another difficulty is simply figuring out what the desired function's specification is. The best I can do is to figure out what it isn't. Earlier in the book (p. 225), the author gave the following example of a "memorizing" function:
Imagine a generic server that takes requests in the form of strings with Lua code. Each time it gets a request, it runs load on the string, and then calls the resulting function. However, load is an expensive function, and some commands to the server may be quite frequent. Instead of calling load repeatedly each time it receives a common command like "closeconnection()", the server can memorize the results from load using an auxiliary table. Before calling load, the server checks in the table whether the given string already has a translation. If it cannot find a match then (and only then) the server calls load and stores the result into the table. We can pack this behavior in a new function:
[standard memo(r)ized implementation omitted; see variant using a weak-value table below]
The savings with this scheme can be huge. However, it may also cause unsuspected waste. ALthough some commands epeat over and over, many other commands happen only once. Gradually, the ["memorizing"] table results accumulates all commands the server has ever received plus their respective codes; after enough time, this behavior will exhaust the server's memory.
A weak table provides a simple solution to this problem. If the results table has weak values, each garbage-collection cycle will remove all translations not in use at that moment (which means virtually all of them)1:
local results = {}
setmetatable(results, {__mode = "v"}) -- make values weak
function mem_loadstring (s)
local res = results[s]
if res == nil then -- results not available?
res = assert(load(s)) -- compute new results
result[s] = res -- save for later reuse
end
return res
end
As the original problem statement notes, this scheme won't work when the function to be memo(r)ized returns strings, because the garbage collector does not treat strings as "collectable".
Of course, if one is allowed to change the desired function's interface so that instead of returning a string, it returns a singleton table whose sole item is the real result string, then the problem becomes almost trivial, but I find it hard to believe that the author had such a crude "solution" in mind2.
In case it matters, I am using Lua 5.3.
1 As an aside, if the rationale for memo(r)ization is to avoid invoking load more often than necessary, the scheme proposed by the author does not make sense to me. It seems to me that this scheme is based on the assumption (a heuristic, really) that a translation that is used frequently, and thus would pay to memo(r)ize, is also one that is always reachable (and hence not collectable). I don't see why this should necessarily, or even likely, be the case.
2 One may be able to put lipstick on this pig in the form of a __tostring method that would allow the table (the one returned by the memo(r)ized function) to masquerade as a string in certain contexts; it's still a pig, though.

Your idea is correct: wrap string into a table (because table is collectable).
function memormoize (func_from_string_to_string)
local cached = {}
setmetatable(cached, {__mode = "v"})
return
function(s)
local c = cached[s] or {func_from_string_to_string(s)}
cached[s] = c
return c[1]
end
end
And I see no pigs in this solution :-)
one that is always reachable (and hence not collectable). I don't see why this should necessarily, or even likely, be the case.
There will be no "always reachable" items in a weak table.
But most frequent items will be recalculated only once per GC cycle.
The ideal solution (never collect frequently used items) would require more complex implementation.
For example, you can move items from normal cache to weak cache when item's "inactivity timer" reaches some threshold.

Related

How to efficiently extract all values in a large master table that start with a specified string

I'm designing the UI for a Lua program, one element of which requires the user to either select an existing value from a master table or create a new value in that table.
I would normally use an IUP list with EDITBOX = "YES".
However, the number of items that the user can select may run into many hundreds or possibly thousands, and the performance when populating the list in iup (and also selecting from it) is unacceptably slow. I cannot control the number of items in the table.
My current thinking is to create a list with an editbox, but without any values. As the user types into the editbox (after perhaps 2-3 characters) the list would populate with the subset of table items that start with the characters typed. The user could then select an item from the list or keep typing to narrow the options or create a new item.
For this to work, I need to be able to create a new table with the items from the master table that start with the entered characters.
One option would be to iterate through the master table using the Penlight 'startswith' function to create the new table:
require "pl.init"
local subtable = {} --empty result table
local startstring = "xyz" -- will actually be set by the iup control
for _, v in ipairs (mastertable) do
if stringx.startswith(v, startstring) then
table.insert(subtable,v)
end
end
However, I'm worried about the performance of doing that if the master table is huge. Is there a more efficient way to code this, or a different way I could implement the UI?

There are various approaches you can take to improve the big-O performance of your prefix search, at the cost of increased code complexity; that said, given the size of your dataset (thousands of items) and the intended use (triggered by user interaction, rather than e.g. game logic that needs to run every frame), I think a simple linear search over the options is almost certainly going to be fast enough.
To test this theory, I timed the following code:
local dict = {}
for word in io.lines('/usr/share/dict/words') do
table.insert(dict, word)
end
local matched = {}
local search = "^" .. (...)
for _,word in ipairs(dict) do
if word:match(search) then
table.insert(matched, word)
end
end
print('Found '..#matched..' words.')
I used /usr/bin/time -v and tried it with both lua 5.2 and luaJIT.
Note that this is fairly pessimistic compared to your code:
no attempt made to localize library functions that are repeatedly called, or use # instead of table.insert
timing includes not just the search but also the cost of loading the dictionary into memory in the first place
string.match is almost certainly slower than stringx.startswith
dictionary contains ~100k entries rather than the "hundreds to thousands" you expect in your application
Even with all those caveats, it costs 50-100ms in lua5.2 and 30-50ms in luaJIT, over 50 runs.
If I use os.clock() to time the actual search, it consistently costs about 10ms in lua5.2 and 3-4 in luajit.
Now, this is on a fairly fast laptop (Core i7), but also non-optimized code running on a dataset 10-100x larger than you expect to process; given that, I suspect that the naïve approach of just looping over the entries calling startswith will be plenty fast for your purposes, and result in code that's significantly simpler and easier to debug.

Lua / FiveM, why fetching hashed objects is slow?

My introduction to Lua has been through building a FiveM Role Play server using ESX v1.2. There's many inefficiencies in the code and I've isolated the majority of them which were causing "server thread hitches" in the server, but there's one thing I'm a little lost on.
There's an "extended player" object which holds a bunch of RP specific information aggregated from various database calls, commonly called xPlayer. Each player has a server id which is a number and there's a lua table called ESX.Players and the xPlayer is stored with a hash with ESX.Players[source] = xPlayer
There's a function:
ESX.GetPlayerFromId = function(source)
return ESX.Players[tonumber(source)]
end
which wraps the fetch by hash which is where we get lots of server hitches happening. In calling code I've taken out all processing logic to leave just this function call in place and the hitches happen still and commenting it the issue goes away. The question is why given it's just pulling from a hashtable causing lag when the things in the hashtable are already instantiated tables and are NOT reaching out to touch any IO or do anything fancy in the process? It's just a stored instance.
The one thing that stands out to me is there are a hand full of values on the xPlayer table followed by MANY functions. Is the weight of those functions on a Lua table enough to slow down returning a reference to that? This file shows where the extended player table is created and returned https://github.com/esx-framework/es_extended/blob/v1-final/server/classes/player.lua
I am in the process of pushing all those functions onto a dedicated utility table that will exist once and each of them take an xPlayer argument to be processed by the function. There's thousands of calls throughout the 100+ FiveM resources back onto the xPlayer.() calls, so the weight of code churn and regression testing is epic therefore right now the codebase is not in a state to release to a production server to test with 40+ players hammering it yet. Can anyone confirm if I'm on the right track, is lifting away the functions of the xPlayer tables likely to give me any performance improvement when fetching by hashed key? And if so why?

Why do we use data structures? (when no dynamic allocation is needed)

I'm pretty sure this is a silly newbie question but I didn't know it so I had to ask...
Why do we use data structures, like Linked List, Binary Search Tree, etc? (when no dynamic allocation is needed)
I mean: wouldn't it be faster if we kept a single variable for a single object? Wouldn't that speed up access time? Eg: BST possibly has to run through some pointers first before it gets to the actual data.
Except for when dynamic allocation is needed, is there a reason to use them?
Eg: using linked list/ BST / std::vector in a situation where a simple (non-dynamic) array could be used.

Each thing you are storing is being kept in it's own variable (or storage location). Data structures apply organization to your data. Imagine if you had 10,000 things you were trying to track. You could store them in 10,000 separate variables. If you did that, then you'd always be limited to 10,000 different things. If you wanted more, you'd have to modify your program and recompile it each time you wanted to increase the number. You might also have to modify the code to change the way in which the calculations are done if the order of the items changes because the new one is introduced in the middle.
Using data structures, from simple arrays to more complex trees, hash tables, or custom data structures, allows your code to both be more organized and extensible. Using an array, which can either be created to hold the required number of elements or extended to hold more after it's first created keeps you from having to rewrite your code each time the number of data items changes. Using an appropriate data structure allows you to design algorithms based on the relationships between the data elements rather than some fixed ordering, giving you more flexibility.
A simple analogy might help to understand. You could, for example, organize all of your important papers by putting each of them into separate filing cabinet. If you did that you'd have to memorize (i.e., hard-code) the cabinet in which each item can be found in order to use them effectively. Alternatively, you could store each in the same filing cabinet (like a generic array). This is better in that they're all in one place, but still not optimum, since you have to search through them all each time you want to find one. Better yet would be to organize them by subject, putting like subjects in the same file folder (separate arrays, different structures). That way you can look for the file folder for the correct subject, then find the item you're looking for in it. Depending on your needs you can use different filing methods (data structures/algorithms) to better organize your information for it's intended use.
I'll also note that there are times when it does make sense to use individual variables for each data item you are using. Frequently there is a mixture of individual variables and more complex structures, using the appropriate method depending on the use of the particular item. For example, you might store the sum of a collection of integers in a variable while the integers themselves are stored in an array. A program would need to be pretty simple though before the introduction of data structures wouldn't be appropriate.

Sorry, but you didn't just find a great new way of doing things ;) There are several huge problems with this approach.
How could this be done without requring programmers to massively (and nontrivially) rewrite tons of code as soon as the number of allowed items changes? Even when you have to fix your data structure sizes at compile time (e.g. arrays in C), you can use a constant. Then, changing a single constant and recompiling is sufficent for changes to that size (if the code was written with this in mind). With your approach, we'd have to type hundreds or even thousands of lines every time some size changes. Not to mention that all this code would be incredibly hard to read, write, maintain and verify. The old truism "more lines of code = more space for bugs" is taken up to eleven in such a setting.
Then there's the fact that the number is almost never set in stone. Even when it is a compile time constant, changes are still likely. Writing hundreds of lines of code for a minor (if it exists at all) performance gain is hardly ever worth it. This goes thrice if you'd have to do the same amount of work again every time you want to change something. Not to mention that it isn't possible at all once there is any remotely dynamic component in the size of the data structures. That is to say, it's very rarely possible.
Also consider the concept of implicit and succinct data structures. If you use a set of hard-coded variables instead of abstracting over the size, you still got a data structure. You merely made it implicit, unrolled the algorithms operating on it, and set its size in stone. Philosophically, you changed nothing.
But surely it has a performance benefit? Well, possible, although it will be tiny. But it isn't guaranteed to be there. You'd save some space on data, but code size would explode. And as everyone informed about inlining should know, small code sizes are very useful for performance to allow the code to be in the cache. Also, argument passing would result in excessive copying unless you'd figure out a trick to derive the location of most variables from a few pointers. Needless to say, this would be nonportable, very tricky to get right even on a single platform, and liable to being broken by any change to the code or the compiler invocation.
Finally, note that a weaker form is sometimes done. The Wikipedia page on implicit and succinct data structures has some examples. On a smaller scale, some data structures store much data in one place, such that it can be accessed with less pointer chasing and is more likely to be in the cache (e.g. cache-aware and cache-oblivious data structures). It's just not viable for 99% of all code and taking it to the extreme adds only a tiny, if any, benefit.

The main benefit to datastructures, in my opinion, is that you are relationally grouping them. For instance, instead of having 10 separate variables of class MyClass, you can have a datastructure that groups them all. This grouping allows for certain operations to be performed because they are structured together.
Not to mention, having datastructures can potentially enforce type security, which is powerful and necessary in many cases.
And last but not least, what would you rather do?
string string1 = "string1";
string string2 = "string2";
string string3 = "string3";
string string4 = "string4";
string string5 = "string5";
Console.WriteLine(string1);
Console.WriteLine(string2);
Console.WriteLine(string3);
Console.WriteLine(string4);
Console.WriteLine(string5);
Or...
List<string> myStringList = new List<string>() { "string1", "string2", "string3", "string4", "string5" };
foreach (string s in myStringList)
Console.WriteLine(s);

Does erlang implement record copy-and-modify in any clever way?

given:
-record(foo, {a, b, c}).
I do something like this:
Thing = #foo{a={1,2}, b={3,4}, c={5,6}},
Thing1 = Thing#foo{a={7,8}}.
From a semantic view, Thing and Thing1 are unique entities. However, from a language implementation standpoint, making a full copy of Thing to generate Thing1 would be intensely wasteful. For example, if the record were a megabyte in size and I made a thousand "copies," each modifying a couple of bytes, I've just burned a gigabyte. If the internal structure kept track of a representation of the parent structure and each derivative marked up that parent in a way that indicated its own change but preserved everyone elses' versions, the derivates could be created with a minimum of memory overhead.
My question is this: is erlang doing anything clever - internally - to keep the overhead of the usual erlang scribble;
Thing = #ridiculously_large_record,
Thing1 = make_modified_copy(Thing),
Thing2 = make_modified_copy(Thing1),
Thing3 = make_modified_copy(Thing2),
Thing4 = make_modified_copy(Thing3),
Thing5 = make_modified_copy(Thing4)
...to a minimum?
I ask because there would be a number of changes to the way that I did cross-process communications if this were the case.

The exact workings of the garbage collection and memory allocation is only known to a few. Thankfully, they are very happy to share their knowledge and the following is based on what I have learnt from the erlang-questions mailing list and by discussing with OTP developers.
When messaging between processes, the content is always copied as there is no shared heap between processes. The only exception is binaries bigger than 64 bytes, where only a reference is copied.
When executing code in one process, only parts are updated. Let's analyze tuples, as that is the example you provided.
A tuple is actually a structure that keeps references to the actual data somewhere on the heap (except for small integers and maybe one more data type which I can't remember). When you update a tuple, using for example setelement/3, a new tuple is created with the given element replaced, however for all other elements only the reference is copied. There is one exception which I have never been able to take advantage of.
The garbage collector keeps track of each tuple and understands when it is safe to reclaim any tuple that is no longer used. It might be that the data referenced by the tuple is still in use, in which case the data itself is not collected.
As always, Erlang gives you some tools to understand exactly what is going on. The efficiency guide details how to use erts_debug:size/1 and erts_debug:flat_size/1 to understand the size of the data structure when used internally in a process and when copied. The trace tools also allows you to understand when, what and how much was garbage collected.

The record foo is of arity four (holding four words), but the whole structure is 14 words in size. Any immediate (pids, ports, small integers, atoms, catch and nil) can be stored directly in the tuples array. Any other term which can't fit into a word, such as other tuples, are not stored directly but referenced by boxed pointers (a boxed pointer is an erlang term with a forwarding address to the real eterm ... just internals).
In your case a new tuple of same arity is created and the atom foo and all the pointers are copied from the previous tuple except for index two, a, which points to the new tuple {7,8} which constitutes 3 words. In all 5 + 3 new words are created on the heap and only 3 words are copied from the old tuple the other 9 words are not touched.
Excessively large tuples are not recommended. When updating a tuple, the whole tuple, i.e the array and not the deep content, needs to copied and then updated in other to preserve a persistent data structure. This will also generate increased garbage, forcing the garbage collector to heat up which also hurts performance. The dict and array modules avoids using large tuples for this reason and have a shallow tuple tree instead.

I can definitely verify what people have already pointed out:
a record is just a tuple with the record name as the first element and all the fields just the following tuple element
when an element of a tuple is changed, updating a field in a record in your case, only the top level tuple is new, all the elements are just reused
This works just because we have immutable data. So in your example each time you update a value in a #foo record none of the data in the elements are copied and only a new 4-element tuple (5 words) is created. Erlang will never does a deep copy in this type of operation or when passing arguments in function calls.

In conclusion:
Thing = #foo{a={1,2}, b={3,4}, c={5,6}},
Thing1 = Thing#foo{a={7,8}}.
Here, if Thing is not used again, it will probably be updated in place and copying of the tuple will be avoided, as the Efficiency Guide says. (tuple and record syntax is complied into something like setelement, I think)
Thing = #ridiculously_large_record,
Thing1 = make_modified_copy(Thing),
Thing2 = make_modified_copy(Thing1),
...
Here the tuples are actually copied every time.
I guess that it would be theoretically possible make an interesting optimization to this. If the compiler could perform escape analysis on the return value of make_modified_copy and detect that the only reference to it is the one returned, in could save this information about the function. When it encounter a call the that function it would know that it is safe to modify the return value in place.
This would only be possible to do on inter module calls, because of the code replace feature.
Maybe one day we will have it.

(Secure) Random string?

In Lua, one would usually generate random values, and/or strings by using math.random & math.randomseed, where os.time is used for math.randomseed.
This method however has one major weakness; The returned number is always just as random as the current time, AND the interval for each random number is one second, which is way too long if one needs many random values in a very short time.
This issue is even pointed out by the Lua Users wiki: http://lua-users.org/wiki/MathLibraryTutorial, and the corresponding RandomStringS receipe: http://lua-users.org/wiki/RandomStrings.
So I've sat down and wrote a different algorithm (if it even can be called that), that generates random numbers by (mis-)using the memory addresses of tables:
math.randomseed(os.time())
function realrandom(maxlen)
local tbl = {}
local num = tonumber(string.sub(tostring(tbl), 8))
if maxlen ~= nil then
num = num % maxlen
end
return num
end
function string.random(length,pattern)
local length = length or 11
local pattern = pattern or '%a%d'
local rand = ""
local allchars = ""
for loop=0, 255 do
allchars = allchars .. string.char(loop)
end
local str=string.gsub(allchars, '[^'..pattern..']','')
while string.len(rand) ~= length do
local randidx = realrandom(string.len(str))
local randbyte = string.byte(str, randidx)
rand = rand .. string.char(randbyte)
end
return rand
end
At first, everything seems perfectly random, and I'm sure they are... at least for the current program.
So my question is, how random are these numbers returned by realrandom really?
Or is there an even better way to generate random numbers in a shorter interval than one second (which kind of implies that os.time shouldn't be used, as explaind above), without relying on external libraries, AND, if possible, in an entirely crossplatform manner?
EDIT:
There seems to be a major misunderstanding regarding the way the RNG is seeded; In production code, the call to math.randomseed() happens just once, this was just a badly chosen example here.
What I mean by the random value is only random once per second, is easily demonstrated by this paste: http://codepad.org/4cDsTpcD
As this question will get downvoted regardless my edits, I also cancelled my previously accepted answer - In hope for a better one, even if just better opinions. I understand that issues regarding random values/numbers has been discussed many times before, but I have not found such a question that could be relevant to Lua - Please keep that in mind!

You should not call seed each time you call random, you ought to call it only once, on the program initialization (unless you get the seed from somewhere, for example, to replicate some previous "random" behaviour).
Standard Lua random generator is of poor quality in the statistical sense (as it is, in fact, standard C random generator), do not use it if you care for that. Use, for example, lrandom module (available in LuaRocks).
If you need more secure random, read from /dev/random on Linux. (I think that Windows should have something along the same lines — but you may need to code something in C to use it.)
Relying on table pointer values is a bad idea. Think about alternate Lua implementations, in Java, for example — there is no telling what they would return. (Also, the pointer values may be predictable, and they may be, under certain circumstances the same each time the program is invoked.)
If you want finer precision for the seed (and you will want this only if you're launching the program more often than once per second), you should use a timer with better resolution. For example, socket.gettime() from LuaSocket. Multiply it by some value, since math.randomseed is working with integer part only, and socket.gettime() returns time in (floating point) seconds.
require 'socket'
math.randomseed(socket.gettime() * 1e6)
for i = 1, 1e3 do
print(math.random())
end

This method however has one major
weakness; The returned number is
always just as random as the current
time, AND the interval for each random
number is one second, which is way too
long if one needs many random values
in a very short time.
It has those weaknesses only if you implement it incorrectly.
math.randomseed is supposed to be called sparingly - usually just once at the beginning of your program, and it usually seeds using os.time. Once the seed is set, you can use math.random many times, and it will yield random values.
See what happens on this sample:
> math.randomseed(1)
> return math.random(), math.random(), math.random()
0.84018771715471 0.39438292681909 0.78309922375861
> math.randomseed(2)
> return math.random(), math.random(), math.random()
0.70097636929759 0.80967634907443 0.088795455214007
> math.randomseed(1)
> return math.random(), math.random(), math.random()
0.84018771715471 0.39438292681909 0.78309922375861
When I change the seed from 1 to 2, I get different random results. But when I go back to 1, the "random sequence" is reset. I obtain the same values as before.
os.time() returns an ever-increasing number. Using it as a seed is appropriate; then you can invoke math.random forever and have different random numbers every time you invoke it.
The only scenario you have to be a bit worried about non-randomness is when your program is supposed to be executed more than once per second. In that case, as the others are saying, the simplest solution is using a clock with higher definition.
In other words:
Invoke math.randomseed with an appropiate seed (os.time() is ok 99% of the cases) at the beginning of your program
Invoke math.random every time you need a random number.
Regards!

Some thoughts on the first part of your question:
So my question is, how random are these numbers returned by realrandom really?
Your function is attempting to discover the address of a table by using a quirk of its default implementation of tostring(). I don't believe that the string returned by tostring{} has a specified format, or that the value included in that string has any documented meaning. In practice, it is derived from the address of something related to the specific table, and so distinct tables convert to distinct strings. However, the next version of Lua is free to change that to anything that is convenient. Worse, the format it takes will be highly platform dependent because it appears to use the %p format specifier to sprintf() which is only specified as being a sensible representation of a pointer.
There's also a much bigger issue. While the address of the nth table created in a process might seem random on your platform, tt might not be random at all. Or it might vary in only a few bits. For example, on my win7 box only a few bits vary, and not very randomly:
C:...>for /L %i in (1,1,20) do # lua -e "print{}"
table: 0042E5D8
table: 0061E5D8
table: 0024E5D8
table: 0049E5D8
table: 0042E5D8
table: 0042E5D8
table: 0042E5D8
table: 0064E5D8
table: 0042E5D8
table: 002FE5D8
table: 0042E5D8
table: 0049E5D8
table: 0042E5D8
table: 0042E5D8
table: 0042E5D8
table: 0024E5D8
table: 0042E5D8
table: 0042E5D8
table: 0061E5D8
table: 0042E5D8
Other platforms will vary, of course. I'd even expect there to be platforms where the address of the first allocated table is completely deterministic, and hence identical on every run of the program.
In short, the address of an arbitrary object in your process image is not a very good source of randomness.
Edit: For completeness, I'd like to add a couple of other thoughts that came to mind over night.
The stock tostring() function is supplied by the base library and implemented by the function luaB_tostring(). The relevant bit is this fragment:
switch (lua_type(L, 1)) {
...
default:
lua_pushfstring(L, "%s: %p", luaL_typename(L, 1), lua_topointer(L, 1));
break;
If you really are calling this function, then the end of the string will be an address, represented by standard C sprintf() format %p, strongly related to the specific table. One observation is that I've seen several distinct implementations for %p. Windows MSVCR80.DLL (the version of the C library used by the current release of Lua for Windows) makes it equivalent to %08X. My Ubuntu Karmic Koala box appears to make it equivalent to %#x which notably drops leading zeros. If you are going to parse out that part of the string, then you should do it in a way that is more flexible in the face of variation of the meaning of %p.
Note, also, that doing anything like this in library code may expose you to a couple of surprises.
First, if the table passed to tostring() has a metatable that provides the function __tostring(), then that function will be called, and the fragment quoted above will never be executed at all. In your case, that issue cannot arise because tables have individual metatables, and you didn't accidentally apply a metatable to your local table.
Second, by the time your module loads, some other module or user-supplied code might have replaced the stock tostring() with something else. If the replacement is benign, (such as a memoization wrapper) then it likely doesn't matter to the code as written. However, this would be a source of attack, and is entirely outside the control of your module. That doesn't strike me as a good idea if the goal is some kind of improved security for your random seed material.
Third, you might not be loaded in a stock Lua interpreter at all, and the larger application (Lightroom, WoW, Wireshark, ...) may choose to replace the base library functions with their own implementations. This is a much less likely issue for tostring(), but note that the base library's print() is a frequent target for replacement or removal in alternate implementations and there are modules (Lua Lanes, for one) that break if print is not the implementation in the base library.

A few important things come to mind:
In most other languages you typically only call the random 'seed' function once at the beginning of the program or perhaps at limited times throughout its execution. You generally do not want to call it each time you generate a random number/sequence. If you call it once when the program starts you get around the "once per second" limitation. By calling it each time you may actually end up with less randomness in your results.
Your realrandom() function seems to rely on a private implementation detail of Lua. What happens in the next major release if this detail changes to always return the same number, or only even numbers, etc.... Just because it works for now is not a strong enough guarantee, especially in the case of wanting a secure RNG.
When you say "everything seems perfectly random" how are you measuring this performance? We humans are terrible at determining if a sequence is random or not and just looking at a sequence of numbers would be virtually impossible to truly tell if they were random or not. There are many ways to quantify the "randomness" of a series including frequency distribution, autocorrelation, compression, and many more far beyond my understanding.
If you are writing a true "secure PRNG" for production do not write your own! Investigate and use a library or algorithm by experts who has spent years/decades studying, designing and trying to break it. True secure random number generation is hard.
If you need more info start on the PRNG article on Wikipedia and use the references/links there as needed.

Categories

HOME

ruby-on-rails

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart