How to implement copy-on-write technique for stack management in postfix calculations of big numbers?
I want to optimize my system regarding operations like:
duplicate top of stack
swap the two top elements
copy the second element on stack to the top
rotate the stack making the third element on top
apply n-ary operations on the stack
A typical operation take a number of parameters from top of stack and leave a number of results instead.
My stack is implemented as an array where data grows from low-mem to hi-mem and pointers from hi-mem to low-mem.
As I can see, the problem is not in the copy-on-write technique per se, but in memory management and garbage collecting.
In Forth we cannot have operator overloading and should have an explicitly separate set of operations for each class of numbers. Then we have two options:
A set of operations over bigint objects (their handlers), with manual memory management, similar to operations over files, dynamic memory buffers, or other objects.
A full set of operations including the separate stack, like operations over floating-point numbers, with automatic memory management.
Note that option (2) contains (1) under the hood.
Reference counting
One approach to implement automatic memory management in option (2) is to use a dedicated stack (or maybe two stacks) and reference counting. The stack contains references to objects (buffers). The operations that alter data (like b::1+) should make copy of the buffer and replace the reference by a new one if the counter is greater than 1 (otherwise data can be altered in the place).
Placing an item (a reference) to the stack should increase its counter, removing from the stack should decrease its counter. When the counter become 0, the buffer should be freed.
Such operations like b::dup, b::over increase the counter. b::swap doesn't change any counter. b::drop decreases the counter (and frees the buffer if the counter is 0).
Moving an item from the stack into a bigint variable should not change the counter in the result. But the counter for the previous value of the variable (if any) should be decreased.
If named variables and constants are not enough (e.g., to have user-defined arrays), you may need to introduce into the API the pointers (a kind of anonymous variables) or handlers to bigint objects.
Immutable objects
Another approach to implement option (2) is to have immutable objects and garbage collection loop. In this approach we don't need to count references, but need to maintain the list of external pointers (including variables).
A simple example of such garbage collection can be found in the implementation of s-expressions by Peter Sovietov: Функциональное программирование на языке Форт ("Functional programming in Forth language" in Russian, but it can be easy translated using Google Translate).
I am using the C++ API for z3 so I don't need to worry about reference counting or memory management.
However, I would like to store information against the z3 AST using a std::map along the lines of std::map<Z3_ast, some_struct>.
When a particular Z3_ast object is deleted, I would like to remove its entry
from this map.
Is there some way to set up a call back function which will be called when
a Z3_ast object reference count returns to 0 and the Z3_ast object is deleted?
No, there is no such callback, but conceivable you could hack the API to provide one. However, for as long as there is a at least one reference to the Z3_ast, the reference count should never drop to 0 anyways (and your map holds a reference, i.e., the one in the map).
If you don't increment the reference count at the time at which a Z3_ast is stored in the map, the reference counting paradigm is broken out of, which will probably result in bugs that are very hard to debug.
I'm dabbling in Love2D using Lua and have just implemented a StateMachine to handle transitions between a set of states e.g. IntroState, MenuState, PlayState etc..
In previous programs I usally release objects and/or states that are only a "one-time-deal", iow will only be presented to the player once during the lifetime of the application. In C++ I use the sizeof operator which returns the size in bytes of the passed object, just to get some feedback of how much memory I release at a certain point.
Are there any corresponding keyword or trick in Lua to achieve this?
If you need fine-grained information, you can use getsize as #siffiejoe mentioned in combination with some table traversal to get to all local and global objects. If you need more coarse-grained approach, you can use collectgarbage('count') to get the total memory used by Lua.
This SO answer and this lua discussion on memory tracking may be of some help. Note that you don't have control over memory release as it's handled by the garbage collector (although there are several GC settings you can tweak).
As i know, volatile is usually used to prevent unexpected compile optimization during some hardware operations. But which scenes volatile should be declared in property definition puzzles me. Please give some representative examples.
Thx.
A compiler assumes that the only way a variable can change its value is through code that changes it.
int a = 24;
Now the compiler assumes that a is 24 until it sees any statement that changes the value of a. If you write code somewhere below above statement that says
int b = a + 3;
the compiler will say "I know what a is, it's 24! So b is 27. I don't have to write code to perform that calculation, I know that it will always be 27". The compiler may just optimize the whole calculation away.
But the compiler would be wrong in case a has changed between the assignment and the calculation. However, why would a do that? Why would a suddenly have a different value? It won't.
If a is a stack variable, it cannot change value, unless you pass a reference to it, e.g.
doSomething(&a);
The function doSomething has a pointer to a, which means it can change the value of a and after that line of code, a may not be 24 any longer. So if you write
int a = 24;
doSomething(&a);
int b = a + 3;
the compiler will not optimize the calculation away. Who knows what value a will have after doSomething? The compiler for sure doesn't.
Things get more tricky with global variables or instance variables of objects. These variables are not on stack, they are on heap and that means that different threads can have access to them.
// Global Scope
int a = 0;
void function ( ) {
a = 24;
b = a + 3;
}
Will b be 27? Most likely the answer is yes, but there is a tiny chance that some other thread has changed the value of a between these two lines of code and then it won't be 27. Does the compiler care? No. Why? Because C doesn't know anything about threads - at least it didn't used to (the latest C standard finally knows native threads, but all thread functionality before that was only API provided by the operating system and not native to C). So a C compiler will still assume that b is 27 and optimize the calculation away, which may lead to incorrect results.
And that's what volatile is good for. If you tag a variable volatile like that
volatile int a = 0;
you are basically telling the compiler: "The value of a may change at any time. No seriously, it may change out of the blue. You don't see it coming and *bang*, it has a different value!". For the compiler that means it must not assume that a has a certain value just because it used to have that value 1 pico-second ago and there was no code that seemed to have changed it. Doesn't matter. When accessing a, always read its current value.
Overuse of volatile prevents a lot of compiler optimizations, may slow down calculation code dramatically and very often people use volatile in situations where it isn't even necessary. For example, the compiler never makes value assumptions across memory barriers. What exactly a memory barrier is? Well, that's a bit far beyond the scope of my reply. You just need to know that typical synchronization constructs are memory barriers, e.g. locks, mutexes or semaphores, etc. Consider this code:
// Global Scope
int a = 0;
void function ( ) {
a = 24;
pthread_mutex_lock(m);
b = a + 3;
pthread_mutex_unlock(m);
}
pthread_mutex_lock is a memory barrier (pthread_mutex_unlock as well, by the way) and thus it's not necessary to declare a as volatile, the compiler will not make an assumption of the value of a across a memory barrier, never.
Objective-C is pretty much like C in all these aspects, after all it's just a C with extensions and a runtime. One thing to note is that atomic properties in Obj-C are memory barriers, so you don't need to declare properties volatile. If you access the property from multiple threads, declare it atomic, which is even default by the way (if you don't mark it nonatomic, it will be atomic). If you never access it from multiple thread, tagging it nonatomic will make access to that property a lot faster, but that only pays off if you access the property really a lot (a lot doesn't mean ten times a minute, it's rather several thousand times a second).
So you want Obj-C code, that requires volatile?
#implementation SomeObject {
volatile bool done;
}
- (void)someMethod {
done = false;
// Start some background task that performes an action
// and when it is done with that action, it sets `done` to true.
// ...
// Wait till the background task is done
while (!done) {
// Run the runloop for 10 ms, then check again
[[NSRunLoop currentRunLoop]
runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.01]
];
}
}
#end
Without volatile, the compiler may be dumb enough to assume, that done will never change here and replace !done simply with true. And while (true) is an endless loop that will never terminate.
I haven't tested that with modern compilers. Maybe the current version of clang is more intelligent than that. It may also depend on how you start the background task. If you dispatch a block, the compiler can actually easily see whether it changes done or not. If you pass a reference to done somewhere, the compiler knows that the receiver may the value of done and will not make any assumptions. But I tested exactly that code a long time ago when Apple was still using GCC 2.x and there not using volatile really caused an endless loop that never terminated (yet only in release builds with optimizations enabled, not in debug builds). So I would not rely on the compiler being clever enough to do it right.
Just some more fun facts about memory barriers:
If you ever had a look at the atomic operations that Apple offers in <libkern/OSAtomic.h>, then you might have wondered why every operation exists twice: Once as x and once as xBarrier (e.g. OSAtomicAdd32 and OSAtomicAdd32Barrier). Well, now you finally know it. The one with "Barrier" in its name is a memory barrier, the other one isn't.
Memory barriers are not just for compilers, they are also for CPUs (there exists CPU instructions, that are considered memory barriers while normal instructions are not). The CPU needs to know these barriers because CPUs like to reorder instructions to perform operations out of order. E.g. if you do
a = x + 3 // (1)
b = y * 5 // (2)
c = a + b // (3)
and the pipeline for additions is busy, but the pipeline for multiplication is not, the CPU may perform instruction (2) before (1), after all the order won't matter in the end. This prevents a pipeline stall. Also the CPU is clever enough to know that it cannot perform (3) before either (1) or (2) because the result of (3) depends on the results of the other two calculations.
Yet, certain kinds of order changes will break the code, or the intention of the programmer. Consider this example:
x = y + z // (1)
a = 1 // (2)
The addition pipe might be busy, so why not just perform (2) before (1)? They don't depend on each other, the order shouldn't matter, right? Well, it depends. Consider another thread monitors a for changes and as soon as a becomes 1, it reads the value of x, which should now be y+z if the instructions were performed in order. Yet if the CPU reordered them, then x will have whatever value it used to have before getting to this code and this makes a difference as the other thread will now work with a different value, not the value the programmer would have expected.
So in this case the order will matter and that's why barriers are needed also for CPUs: CPUs don't order instructions across such barriers and thus instruction (2) would need to be a barrier instruction (or there needs to be such an instruction between (1) and (2); that depends on the CPU). However, reordering instructions is only performed by modern CPUs, a much older problem are delayed memory writes. If a CPU delays memory writes (very common for some CPUs, as memory access is horribly slow for a CPU), it will make sure that all delayed writes are performed and have completed before a memory barrier is crossed, so all memory is in a correct state in case another thread might now access it (and now you also know where the name "memory barrier" actually comes from).
You are probably working a lot more with memory barriers than you are even aware of (GCD - Grand Central Dispatch is full of these and NSOperation/NSOperationQueue bases on GCD), that's why your really need to use volatile only in very rare, exceptional cases. You might get away writing 100 apps and never have to use it even once. However, if you write a lot low level, multi-threading code that aims to achieve maximum performance possible, you will sooner or later run into a situation where only volatile can grantee you correct behavior; not using it in such a situation will lead to strange bugs where loops don't seem to terminate or variables simply seem to have incorrect values and you find no explanation for that. If you run into bugs like these, especially if you only see them in release builds, you might miss a volatile or a memory barrier somewhere in your code.
A good explanation is given here: Understanding “volatile” qualifier in C
The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler.
Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time. The system always reads the current value of a volatile object from the memory location rather than keeping its value in temporary register at the point it is requested, even if a previous instruction asked for a value from the same object. So the simple question is, how can value of a variable change in such a way that compiler cannot predict. Consider the following cases for answer to this question.
1) Global variables modified by an interrupt service routine outside the scope: For example, a global variable can represent a data port (usually global pointer referred as memory mapped IO) which will be updated dynamically. The code reading data port must be declared as volatile in order to fetch latest data available at the port. Failing to declare variable as volatile, the compiler will optimize the code in such a way that it will read the port only once and keeps using the same value in a temporary register to speed up the program (speed optimization). In general, an ISR used to update these data port when there is an interrupt due to availability of new data
2) Global variables within a multi-threaded application: There are multiple ways for threads communication, viz, message passing, shared memory, mail boxes, etc. A global variable is weak form of shared memory. When two threads sharing information via global variable, they need to be qualified with volatile. Since threads run asynchronously, any update of global variable due to one thread should be fetched freshly by another consumer thread. Compiler can read the global variable and can place them in temporary variable of current thread context. To nullify the effect of compiler optimizations, such global variables to be qualified as volatile
If we do not use volatile qualifier, the following problems may arise
1) Code may not work as expected when optimization is turned on.
2) Code may not work as expected when interrupts are enabled and used.
volatile comes from C. Type "C language volatile" into your favourite search engine (some of the results will probably come from SO), or read a book on C programming. There are plenty of examples out there.
Is a z/OS PL/I CONTROLLED variable preserved between separate invocations of a procedure? Let’s suppose that we need a counter that is internal to a subroutine and preserved across invocations. The easiest way to do it would be to use a static variable initialized to zero and incremented on each entry to the subroutine. But you can’t do that if the program has to be reentrant. So the question is whether we have access to a controlled variable that was allocated in a previous call. Would the following code work?
PROC1: PROCEDURE OPTIONS(MAIN);
...
CALL A;
...
A: PROCEDURE;
DECLARE COUNT CONTROLLED ALIGNED FIXED BIN(15);
IF (ALLOCATION(COUNT) = 0)
THEN ALLOCATE COUNT INIT(1);
ELSE COUNT = COUNT + 1;
...
END A;
END PROC1;
According to the PL/I Language Reference, after you ALLOCATE a variable, you do not need to FREE it (though that is generally good practice), and “All controlled storage is freed at the end of the program.” It doesn’t say that storage is freed at the end of the block.
The PL/I Programming Guide provides some clues in the chapter Using PLIDUMP in the Locating Controlled Variables section, but it is not definitive. It says that the key to locating a controlled variable is to find its anchor. With NORENT WRITABLE there is an anchor in static storage. With NORENT NOWRITABLE(FWS) there is an address to an anchor automatic storage. (There is an extra level of indirection.) With NORENT NOWRITABLE(PRV) there appears to be a static table with an offset into a private table for each controlled variable. In other words, depending on the processing options, maybe the variable is accessible, and maybe it isn’t. It doesn’t say anything about using the RENT option.
Any thoughts?
As per the PL/I Programming Guide compile time option "RENT", Your code is "naturally reentrant" if it does not alter any of its static variables.
The RENT option specifies that the compiler is to take code that is not naturally reentrant and make it reentrant.
Thus, you can increment STATIC variable on each entry to subroutine if the program is compiled with RENT option.
Please refer to this link => Rent Option from PL/I Programming Guide
As per "PL/I Structured Programming" by J.K. Hughes, A REENTRANT procedure may be called by other procedures asynchronously. For example task B invokes SQRT function. While this function is in process of computing square root, task A (having higher execution priority than task B) needs to gain control of the system and use SQRT function. The SQRT function is interrupted and intermediate results for task B saved; then task A uses SQRT function.
When task A completes its execution, control is returned to task B at the point where it was interrupted. Then, task B completes its use of SQRT function.