When are module level "constant" arrays being initialized in F# - f#

The question came up while I was trying to write a lookup table and during debugging I had the impression, that the array might not be initialized at compile or load time, but rather in a lazy way... Unfortunately I could not find the answer in the MSDN chapter about arrays.
Lets look at some sample code first. The code is in a compiled application, not a script, btw.
module Foo =
let bar = [| for i in 0..9 -> yield (i*i) |]
When does Foo.bar get initialized?
At compile time?
At load time?
Lazy, when first accessed?
Never? (Stays an IEnumerator based sequence despite its array type?)
On a side note, my sequence expression is more complicated than the example above and also uses other functions within scope.
Are there cases which are handled in different ways, such as trivial vs complex eypression or long vs short array etc?

Foo.bar gets initialized at load time - or, when your application starts.
There are slight differences between how this is done for libraries and for applications. For applications, the compiler inserts appropriate initialization into the Main method. For libraries, the initialization check is inserted into static constructors of (I believe) all types, so when you access any type from a library, the initialization is done (this may be sometime after Main, but still before you run any code from the library).
It does not really depend on what the code is - if it is a value, it will be initialized. There are some values like Lazy<T> or IEnumerable<T> that do not immediately fully evaluate, but the value is initialized nevertheless.

Related

D/Dlang: Lua interface, any way to force users to have no access to intermediate objects?

Status: Sort of solved. Switching Lua.Ref (close equivalent to LuaD LuaObject) to struct as suggested in answer has solved most issues related to freeing references, and I changed back to similar mechanism LuaD uses. More about this in the end.
In one of my project, I am working with Lua interface. I have mainly borrowed the ideas from LuaD. The mechanism in LuaD uses lua_ref & lua_unref to be able to move lua table/function references in D space, but this causes heavy problems because the calls to destructors and their order is not guaranteed. LuaD usually segfaults at least at the program exit.
Because it seems that LuaD is not maintained anymore, I decided to write my own interface for my purposes. My Lua interface class is here: https://github.com/mkoskim/games/blob/master/engine/util/lua.d
Usage examples can be found here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/luademo.d
And in case you need, the Lua script used by the example is here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/data/test.lua
The interface works like this:
Lua.opIndex pushes global table and index key to stack, and return Top object. For example, lua["math"] pushes _G and "math" to stack.
Further accesses go through Top object. Top.opIndex goes deeper in the table hierarchy. Other methods (call, get, set) are "final" methods, which perform an operation with the table and key at the top of the stack, and clean the stack afterwards.
Close everything works fine, except this mechanism has nasty quirk/bug that I have no idea how to solve it. If you don't call any of those "final" methods, Top will leave table and key to the stack:
lua["math"]["abs"].call(-1); // Works. Final method (call) called.
lua["math"]["abs"]; // table ref & key left to stack :(
What I know for sure, is that playing with Top() destructor does not work, as it is not called immediately when object is not referenced anymore.
NOTE: If there is some sort of operator to be called when object is accessed as rvalue, I could replace call(), set() and get() methods with operator overloads.
Questions:
Is there any way to prevent users to write such expressions (getting Top object without calling any of "final" methods)? I really don't want users to write e.g. luafunc = lua["math"]["abs"] and then later try to call it, because it won't work at all. Not without starting to play with lua_ref & lua_unref and start fighting with same issues that LuaD has.
Is there any kind of opAccess operator overloading, that is, overloading what happens when object is used as rvalue? That is, expression "a = b" -> "a.opAssign(b.opAccess)"? opCast does not work, it is called only with explicit casts.
Any other suggestions? I internally feel that I am looking solution from wrong direction. I feel that the problem reside in the realm of metaprogramming: I am trying to "scope" things at expression level, which I feel is not that suitable for classes and objects.
So far, I have tried to preserve the LuaD look'n'feel at interface user's side, but I think that if I could change the interface to something like following, I could get it working:
lua.call(["math", "abs"], 1); // call lua.math.abs(2)
lua.get(["table", "x", "y", "z"], 2); // lua table.x.y.z = 2
...
Syntactically that would ensure that reference to lua object fetched by indexing is finally used for something in the expression, and the stack would be cleaned.
UPDATE: Like said, changing Lua.Ref to struct solved problems related to dereferencing, and I am again using reference mechanism similar to LuaD. I personally feel that this mechanism suits the LuaD-style syntax I am using, too, and it can be quite a challenge to make the syntax working correctly with other mechanisms. I am still open to hear if someone has ideas to make it work.
The system I sketched to replace references (to tackle the problem with objects holding references living longer than lua sandbox) would probably need different kind of interface, something similar I sketched above.
You also have an issue when people do
auto math_abs = lua["math"]["abs"];
math_abs.call(1);
math_abs.call(3);
This will double pop.
Make Top a struct that holds the stack index of what they are referencing. That way you can use its known scoping and destruction behavior to your advantage. Make sure you handle this(this) correctly as well.
Only pop in the destructor when the value is the actual top value. You can use a bitset in LuaInterface to track which stack positions are in use and put the values in it using lua_replace if you are worried about excessive stack use.

Dynamically modify symbol table at runtime (in C)

Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
My eventual goal is the following:
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Note: this is different from LD_PRELOAD which would override malloc during the entire program execution.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
In theory this is possible, but in practice it's too hard to do.
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Modifying symbol table (even if it were possible) would not get you to your desired goal.
All calls from anywhere inside your ELF binary (let's assume foo is in the main executable), resolve to the same PLT import slot malloc#plt. That slot is resolved to glibc malloc on the first call (from anywhere in your program, assuming you are not using LD_BIND_NOW=1 or similar). After that slot has been resolved, any further modification to the symbol table will have no effect.
You didn't say how much control over foo you have.
If you can recompile it, the problem becomes trivial:
#define malloc my_malloc
int foo() {
// same code as before
}
#undef malloc
If you are handed a precompiled foo.o, you are linking it with my_malloc.o, and you want to redirect all calls from inside foo.o from malloc to my_malloc, that's actually quite simple to do at the object level (i.e. before final link).
All you have to do is go through foo.o relocation records, and change the ones that say "put address of external malloc here" to "put address of external my_malloc here".
If foo.o contains additional functions besides foo, it's quite simple to limit the relocation rewrite to just the relocations inside foo.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
Yes, it is not easy, but the functionality can be packaged into a library, so at the end of the day, it can be made practical.
Typemock Isolator++
(https://www.typemock.com/isolatorpp-product-page/isolate-pp/)
This is free-to-use, but closed source solution. The usage example from documentation should be instructive
TEST_F(IsolatorPPTests, IsExpired_YearIs2018_ReturnTrue) {
Product product;
// Prepare a future time construct
tm* fakeTime = new tm();
fakeTime->tm_year = 2018;
// Fake the localtime method
FAKE_GLOBAL(localtime);
// Replace the returned value when the method is called
// with the fake value.
WHEN_CALLED(localtime(_)).Return(fakeTime);
ASSERT_TRUE(product.IsExpired());
}
Other libraries of this kind
Mimick, from Q: Function mocking in C?
cpp-stub, from Q: Creating stub functionality in C++
Elfspy, for C++, but sometimes it's OK to test C code from C++ unittests, from Q: C++ mock framework capable of mocking non-virtual methods and C functions
HippoMocks, from Q: Mocking C functions in MSVC (Visual Studio)
the subprojects in https://github.com/coolxv/cpp-stub/tree/master/other
... there is still more, feel free to append ...
Alternate approaches
ld's --wrap option and linker scripts, https://gitlab.com/hedayat/powerfake
various approaches described in answers for Q: Advice on Mocking System Calls
and in answers to Q: How to mock library calls?
Rewrite code to make it testable
This is easier in other languages than C, but still doable even in C. Structure code into small functions without side-effects that can be unit-tested without resorting to trickery, and so on. I like this blog Modularity. Details. Pick One. about the tradeoffs this brings. Personally, I guturally dislike the "sea of small functions and tons of dependency injection" style of code, but I realize that that's the easiest to work with, all things considered.
Excursion to other languages
What you are asking for is trivial to do in Python, with the unittest.mock.patch, or possibly by just assigning the mock into the original function directly, and undoing that at the end of the test.
In Java, there is Mockito/PowerMock, which can be used to replace static methods for the duration of a test. Static methods in Java approximately correspond to regular functions in C.
In Go, there is Mockey, which works similarly to what needs to be done in C. It has similar limitations in that inlining can break this kind of runtime mocking. I am not sure if in C you can hit the issue that very short methods are unmockable because there is not enough space to inject the redirection code; I think more likely not, if all calls go through the Procedure Linkage Table.

cannot traverse the nodes of an AST, while assigning each node an ID

This is more a simple personal attempt to understand what goes on inside Rascal. There must be better (if not already supported) solution.
Here's the code:
fileLoad = |home:///PHPAnalysis/systems/ApilTestScripts/simple1.php|;
fileAST=loadPHPFile(fileLoad,true,false);
//assign a simple id to each node
public map[value,int] assignID12(node N)
{
myID=();
visit(N)
{
case node M:
{
name=getName(M);
myID[name] =999;
}
}
return myID;
}
ids=assignID12(fileAST);
gives me
|stdin:///|(92,4,<1,92>,<1,96>): Expected str, but got value
loadPHPFile returns a node of type: list[Stmt], where each Stmt is one of the many types of statements that could occur in a program (PHP, in my case). Without going into why I'd do this, why doesn't the above code work? Especially frustrating because a very simple example is worked out in the online documentation. See: http://tutor.rascal-mpl.org/Recipes/Basic/Basic.html#/Recipes/Common/CountConstructors/CountConstructors.html
I started a new console, and it seems to work. Of course, I changed the return type from map[value,int] to map[str,int] as it was originally in the example.
The problem I was having was that I may have erroneously defined the function previously. While I quickly fixed an apparent problem, it kept giving me errors. I realized that in Rascal, when you've started a console and imported certain definitions, it (seems)is impossible to overwrite those definitions. The interpreter keeps making reference to the very first definition that you provided. This could just be the interpreter performing a type-check, and preventing unintentional and/or incompatible assignments further down the road. That makes sense for variables (in the typical program sense), but it doesn't seem like the best idea to enforce that on functions (or methods). I feel it becomes cumbersome, because a user typically has to undergo some iterations before he/she is satisfied with a function definition. Just my opinion though...
Most likely you already had the name ids in scope as having type map[str,int], which would be the direct source of the error. You can look in script https://github.com/cwi-swat/php-analysis/blob/master/src/lang/php/analysis/cfg/LabelState.rsc at the function labelScript to see how this is done in PHP AiR (so you don't need to write this code yourself). What this will give you is a script where all the expressions and statements have an assigned ID, as well as the label state, which just keeps track of some info used in this labeling operation (mainly the counter to generate a unique ID).
As for the earlier response, the best thing to do is to give your definitions in modules which you can import. If you do that, any changes to types, etc will be picked up (automatically if the module is already imported, since Rascal will reimport the module for you if it has changed, or when you next import the module). However, if you define something directly in the console, this won't happen. Think of the console as one large module that you keep adding to. Since we can have overloads of functions, if you define the function again you are really defining a new alternative to the function, but this may not work like you expect.

Using Dart as a DSL

I am trying to use Dart to tersely define entities in an application, following the idiom of code = configuration. Since I will be defining many entities, I'd like to keep the code as trim and concise and readable as possible.
In an effort to keep boilerplate as close to 0 lines as possible, I recently wrote some code like this:
// man.dart
part of entity_component_framework;
var _man = entity('man', (entityBuilder) {
entityBuilder.add([TopHat, CrookedTeeth]);
})
// test.dart
part of entity_component_framework;
var man = EntityBuilder.entities['man']; // null, since _man wasn't ever accessed.
The entity method associates the entityBuilder passed into the function with a name ('man' in this case). var _man exists because only variable assignments can be top-level in Dart. This seems to be the most concise way possible to use Dart as a DSL.
One thing I wasn't counting on, though, is lazy initialization. If I never access _man -- and I had no intention to, since the entity function neatly stored all the relevant information I required in another data structure -- then the entity function is never run. This is a feature, not a bug.
So, what's the cleanest way of using Dart as a DSL given the lazy initialization restriction?
So, as you point out, it's a feature that Dart doesn't run any code until it's told to. So if you want something to happen, you need to do it in code that runs. Some possibilities
Put your calls to entity() inside the main() function. I assume you don't want to do that, and probably that you want people to be able to add more of these in additional files without modifying the originals.
If you're willing to incur the overhead of mirrors, which is probably not that much if they're confined to this library, use them to find all the top-level variables in that library and access them. Or define them as functions or getters. But I assume that you like the property that variables are automatically one-shot. You'd want to use a MirrorsUsed annotation.
A variation on that would be to use annotations to mark the things you want to be initialized. Though this is similar in that you'd have to iterate over the annotated things, which I think would also require mirrors.

What percent of functions on OS X are called by the Objective-C runtime?

I'd like to get a firmer grasp of how frequently the runtime in any language that requires one is being called. In this case, I'm specifically interested in knowing:
Of all the function calls getting executed on an OS X or iOS system in any given second (approximations are of course necessary) how many of those are Objective-C runtime functions (i.e. functions that are defined by the runtime)?
Of course it depends on your application, but in general the answer is "a whole lot". Like, a whole freaking lot.
If you really want to see numbers, I'd recommend using dtrace to log all runtime functions as they're called. This blog entry talks about how to do such a thing.
A lot. Here are just a few examples.
Every time you send a message, the actual message sending is done by a runtime function (this is in fact the most called runtime function in pretty much any objective C program).
NSObject class and protocol are not part of the standard library but part of the runtime, therefore any method that ends up executing to the default NSObject implementation is in fact executing runtime code.
Every time you execute a default property accessor (either read or write), that's part of the runtime.
If you use ARC, every time you access a weak reference (either for reading or writing it) that's a runtime function.
Objc runtime includes the C runtime, so anything that involves a C runtime function (for example passing a large structure by value or returning it) is in fact calling into the runtime.
and more.

Resources