local and dynamic allocating - dynamic-memory-allocation

I have a tree and I want to release the allocated memory, but I face a problem that a pointer may refers to a variable that isn't dynamically allocated,so how to know wether this pointer refers to dynamic a variable or not

This is compiler-specific. You may compare given pointer with pointer to a local variable. Result interpretation depends on the way compiler implements heap and stack. Generally, for given compiler, stack pointer is always less (or greater) than heap pointer.
In any case, THIS IS BAD DESIGN.
This may not work if pointer belongs to another heap (for example, allocated in another Dll).

Related

Delphi dynamic array efficiency

I am not a Delphi expert and I was reading online about dynamic arrays and static arrays. In this article I have found a chapter called "Dynamic v. Static Arrays" with a code snippet and below the author says:
[...] access to a dynamic array can be faster than a static array!
I have understood that dynamic arrays are located on the heap (they are implemented with references/pointers).
So far I know that the access time is better on dynamic arrays. But is that the same thing with the allocation? Like if I called SetLength(MyDynArray, 5) is that slower than creating a MyArray = array[0..4] of XXX?
So far I know that the access time is better on dynamic arrays.
That is not correct. The statement in that article is simply false.
But is that the same thing with the allocation? Like if I called SetLength(MyDynArray, 5) is that slower than creating a MyArray = array[0..4] of XXX?
A common fallacy is that static arrays are allocated on the heap. They could be global variables, and so allocated automatically when the module is loaded. They could be local variables and allocated on the stack. They could be dynamically allocated with calls to New or GetMem. Or they could be contained in a compound type (e.g. a record or a class) and so allocated in whatever way the owning object is allocated.
Having got that clear, let's consider a couple of common cases.
Local variable, static array type
As mentioned, static arrays declared as local variables are allocated on the stack. Allocation is automatic and essentially free. Think of the allocation as being performed by the compiler (when it generates code to reserve a stack frame). As such there is no runtime cost to the allocation. There may be a runtime cost to access because this might generate a page fault. That's all perfectly normal though, and if you want to use a small fixed size array as a local variable then there is no faster way to do it.
Member variable of a class, static array type
Again, as described above, the allocation is performed by the containing object. The static array is part of the space reserved for the object and when the object is instantiated sufficient memory is allocated on the heap. The cost for heap allocation does not typically depend significantly on the size of the block to be allocated. An exception to that statement might be really huge blocks but I'm assuming your array is relatively small in size, tens or hundreds of bytes. Armed with that knowledge we can see again that the cost for allocation is essentially zero, given that we are already allocating the memory for the containing object.
Local variable, dynamic array type
A dynamic array is represented by a pointer. So your local variable is a pointer allocated on the stack. The same argument applies as for any other local variable, for instance the local variable of static array type discussed above. The allocation is essentially free. Before you can do anything with this variable though, you need to allocate it with a call to SetLength. That incurs a heap allocation which is expensive. Likewise when you are done you have to deallocate.
Member variable of a class, dynamic array type
Again, allocation of the dynamic array pointer is free, but you must call SetLength to allocate. That's a heap allocation. There needs to be a deallocation too when the object is destroyed.
Conclusion
For small arrays, whose lengths are known at compile time, use of static arrays results in more efficient allocation and deallocation.
Note that I am only considering allocation here. If allocation is a relatively insignificant portion of the time spent working with the object then this performance characteristic may not matter. For instance, suppose the array is allocated at program startup, and then used repeatedly for the duration of the program. In such a scenario the access times dominate the allocation times and the difference between allocation times becomes insignificant.
On the flip side, imagine a short function called repeatedly during the programs lifetime, let's suppose this function is the performance bottleneck. If it operates on a small array, then it is possible that the allocation cost of using a dynamic array could be significant.
Very seldom can you draw hard and fast rules with performance. You need to understand how the tools work, and understand how your program uses these tools. You can then form opinions on which coding strategies might perform best, opinions that you should then test by profiling. You will be surprised more often than you might expect that your intuition is not a good predictor of performance.

Go memory layout compared to C++/C

In Go, it seems there are no constructors, but it is suggested that you allocate an object of a struct type using a function, usually named by "New" + TypeName, for example
func NewRect(x,y, width, height float) *Rect {
return &Rect(x,y,width, height)
}
However, I am not sure about the memory layout of Go. In C/C++, this kind of code means you return a pointer, which point to a temporary object because the variable is allocated on the stack, and the variable may be some trash after the function return. In Go, do I have to worry such kind of thing? Because It seems no standard shows that what kind of data will be allocated on the stack vs what kind of data will be allocated on the heap.
As in Java, there seems to have a specific point out that the basic type such as int, float will be allocated on the stack, other object derived from the object will be allocated on the heap. In Go, is there a specific talk about this?
The Composite Literal section mentions:
Taking the address of a composite literal (§Address operators) generates a unique pointer to an instance of the literal's value.
That means the pointer returned by the New function will be a valid one (allocated on the stack).
Calls:
In a function call, the function value and arguments are evaluated in the usual order.
After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution.
The return parameters of the function are passed by value back to the calling function when the function returns.
You can see more in this answer and this thread.
As mentioned in "Stack vs heap allocation of structs in Go, and how they relate to garbage collection":
It's worth noting that the words "stack" and "heap" do not appear anywhere in the language spec.
The blog post "Escape Analysis in Go" details what happens, mentioning the FAQ:
When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame.
However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors.
Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.
The blog post adds:
The code that does the “escape analysis” lives in src/cmd/gc/esc.c.
Conceptually, it tries to determine if a local variable escapes the current scope; the only two cases where this happens are when a variable’s address is returned, and when its address is assigned to a variable in an outer scope.
If a variable escapes, it has to be allocated on the heap; otherwise, it’s safe to put it on the stack.
Interestingly, this applies to new(T) allocations as well.
If they don’t escape, they’ll end up being allocated on the stack. Here’s an example to clarify matters:
var intPointerGlobal *int = nil
func Foo() *int {
anInt0 := 0
anInt1 := new(int)
anInt2 := 42
intPointerGlobal = &anInt2
anInt3 := 5
return &anInt3
}
Above, anInt0 and anInt1 do not escape, so they are allocated on the stack;
anInt2 and anInt3 escape, and are allocated on the heap.
See also "Five things that make Go fast":
Unlike C, which forces you to choose if a value will be stored on the heap, via malloc, or on the stack, by declaring it inside the scope of the function, Go implements an optimisation called escape analysis.
Go’s optimisations are always enabled by default.
You can see the compiler’s escape analysis and inlining decisions with the -gcflags=-m switch.
Because escape analysis is performed at compile time, not run time, stack allocation will always be faster than heap allocation, no matter how efficient your garbage collector is.

How to detect "dangling pointers" if "Assigned()" can't do it?

In another question, I found out that the Assigned() function is identical to Pointer <> nil. It has always been my understanding that Assigned() was detecting these dangling pointers, but now I've learned it does not. Dangling Pointers are those which may have been created at one point, but have since been free'd and haven't been assigned to nil yet.
If Assigned() can't detect dangling pointers, then what can? I'd like to check my object to make sure it's really a valid created object before I try to work with it. I don't use FreeAndNil as many recommend, because I like to be direct. I just use SomeObject.Free.
Access Violations are my worst enemy - I do all I can to prevent their appearance.
If you have an object variable in scope and it may or may not be a valid reference, FreeAndNil is what you should be using. That or fixing your code so that your object references are more tightly managed so it's never a question.
Access Violations shouldn't be thought of as an enemy. They're bugs: they mean you made a mistake that needs fixed. (Or that there's a bug in some code you're relying on, but I find most often that I'm the one who screwed up, especially when dealing with the RTL, VCL, or Win32 API.)
It is sometimes possible to detect when the address a pointer points to resides in a memory block that is on the heap's list of freed memory blocks. However, this requires comparing the pointer to potentially every block in the heap's free list which could contain thousands of blocks. So, this is potentially a computationally intensive operation and something you would not want to do frequently except perhaps in a severe diagnostic mode.
This technique only works while the memory block that the pointer used to point to continues to sit in the heap free list. As new objects are allocated from the heap, it is likely that the freed memory block will be removed from the heap free list and put back into active play as the home of a new, different object. The original dangling pointer still points to the same address, but the object living at that address has changed. If the newly allocated object is of the same (or compatible) type as the original object now freed, there is practically no way to know that the pointer originated as a reference to the previous object. In fact, in this very special and rare situation, the dangling pointer will actually work perfectly well. The only observable problem might be if someone notices that the data has changed out from under the pointer unexpectedly.
Unless you are allocating and freeing the same object types over and over again in rapid succession, chances are slim that the new object allocated from that freed memory block will be the same type as the original. When the types of the original and the new object are different, you have a chance of figuring out that the content has changed out from under the pointer. However, to do that you need a way to know the type of the original object that the pointer referred to. In many situations in native compiled applications, the type of the pointer variable itself is not retained at runtime. A pointer is a pointer as far as the CPU is concerned - the hardware knows very little of data types. In a severe diagnostic mode it's conceivable that you could build a lookup table to associate every pointer variable with the type allocated and assigned to it, but this is an enormous task.
That's why Assigned() is not an assertion that the pointer is valid. It just tests that the pointer is not nil.
Why did Borland create the Assigned() function to begin with? To further hide pointerisms from novice and occasional programmers. Function calls are easier to read and understand than pointer operations.
The bottom line is that you should not be attempting to detect dangling pointers in code. If you are going to refer to pointers after they have been freed, set the pointer to nil when you free it. But the best approach is not to refer to pointers after they have been freed.
So, how do you avoid referring to pointers after they have been freed? There are a couple of common idioms that get you a long way.
Create objects in a constructor and destroy them in the destructor. Then you simply cannot refer to the pointer before creation or after destruction.
Use a local variable pointer that is created at the beginning of the function and destroyed as the last act of the function.
One thing I would strongly recommend is to avoid writing if Assigned() tests into your code unless it is expected behaviour that the pointer may not be created. Your code will become hard to read and you will also lose track of whether the pointer being nil is to be expected or is a bug.
Of course we all do make mistakes and leave dangling pointers. Using FreeAndNil is one cheap way to ensure that dangling pointer access is detected. A more effective method is to use FastMM in full debug mode. I cannot recommend this highly enough. If you are not using this wonderful tool, you should start doing so ASAP.
If you find yourself struggling with dangling pointers and you find it hard to work out why then you probably need to refactor the code to fit into one of the two idioms above.
You can draw a parallel with array indexing errors. My advice is not to check in code for validity of index. Instead use range checking and let the tools do the work and keep the code clean. The exception to this is where the input comes from outside your program, e.g. user input.
My parting shot: only ever write if Assigned if it is normal behaviour for the pointer to be nil.
Use a memory manager, such as FastMM, that provides debugging support, in particular to fill a block of freed memory with a given byte pattern. You can then dereference the pointer to see if it points at a memory block that starts with the byte pattern, or you can let the code run normallly ad raise an AV if it tries to access a freed memory block through a dangling pointer. The AV's reported memory address will usually be either exactly as, or close to, the byte pattern.
Nothing can find a dangling (once valid but then not) pointer. It's your responsibility to either make sure it's set to nil when you free it's content, or to limit the scope of the pointer variable to only be available within the scope it's valid. (The second is the better solution whenever possible.)
The core point is that the way how objects are implemented in Delphi has some built-in design drawbacks:
there is no distinction between an object and a reference to an object. For "normal" variables, say a scalar (like int) or a record, these two use cases can be well told apart - there's either a type Integer or TSomeRec, or a type like PInteger = ^Integer or PSomeRec = ^TSomeRec, which are different types. This may sound like a neglectable technicality, but it isn't: a SomeRec: TSomeRec denotes "this scope is the original owner of that record and controls its lifecycle", while SomeRec: PSomeRec tells "this scope uses a transient reference to some data, but has no control over the record's lifecycle. So, as dumb it may sound, for objects there's virtually no one who has denotedly control over other objects' lifecycles. The result is - surprise - that the lifecycle state of objects may in certain situations be unclear.
an object reference is just a simple pointer. Basically, that's ok, but the problem is that there's sure a lot of code out there which treats object references as if they were a 32bit or 64bit integer number. So if e.g. Embarcadero wanted to change the implementation of an object reference (and make it not a simple pointer any more), they would break a lot of code.
But if Embarcadero wanted to eliminate dangling object pointers, they would have to redesign Delphi object references:
when an object is freed, all references to it must be freed, too. This is only possible by double-linking both, i.e. the object instance must carry a list with all of the references to it, that is, all memory addresses where such pointers are (on the lowest level). Upon destruction, that list is traversed, and all those pointers are set to nil
a little more comfortable solution were that the "one" holding such a reference can register a callback to get informed when a referenced object is destroyed. In code: when I have a reference FSomeObject: TSomeObject I would want to be able to write in e.g. SetSomeObject: FSomeObject.OnDestruction := Self.HandleDestructionOfSomeObject. But then FSomeObject can't be a pointer; instead, it would have to be at least an (advanced) record type
Of course I can implement all that by myself, but that is tedious, and isn't it something that should be addressed by the language itself? They also managed to implement for x in ...

why is stack and heap both required for memory allocation

I've searched a while but no conclusive answer is present on why value types have to be allotted on the stack while the reference types i.e. dynamic memory or the objects have to reside on the heap.
why cannot the same be alloted on the stack?
They can be. In practice they're not because stack is a typically scarcer resource than heap and allocating reference types on the stack may exhaust it quickly. Further, if a function returns data allocated on its stack, it will require copying semantics on the caller's part or risk returning something that will be overwritten by the next function call.
Value types, typically local variables, can be brought in and out of scope quickly and easily with native machine instructions. Copy semantics for value types on return is trivial as most fit into machine registers. This happens often and should be as cheap as possible.
It is not correct that value types always live on the stack. Read Jon Skeet's article on the topic:
Memory in .NET - what goes where
I understand that the stack paradigm (nested allocations/deallocations) cannot handle certain algorithms which need non-nested object lifetimes.
just as the static allocation paradigm cannot handle recursive procedure calls. (e.g. naive calculation of fibonacci(n) as f(n-1) + f(n-2))
I'm not aware of a simple algorithm that would illustrate this fact though. any suggestions would be appreciated :-)
Local variables are allocated in the stack. If that was not the case, you wouldn't be able to have variables pointing to the heap when allocating variable's memory. You CAN allocate things in the stack if you want, just create a buffer big enough locally and manage it yourself.
Anything a method puts on the stack will vanish when the method exits. In .net and Java, it would be perfectly acceptable (in fact desirable) if a class object vanished as soon as the last reference to it vanished, but it would be fatal for an object to vanish while references to it still exist. It is not in the general case possible for the compiler to know, when a method creates an object, whether any references to that object will continue to exist after the method exits. Absent such assurance, the only safe way to allocate class objects is to store them on the heap.
Incidentally, in .net, one major advantage of mutable value types is that they can be passed by reference without surrendering perpetual control over them. If class 'foo', or a method thereof, has a structure 'boz' which one of foo's methods passes by reference to method 'bar', it is possible for bar, or the methods it calls, to do whatever they want to 'boz' until they return, but once 'bar' returns any references it held to 'boz' will be gone. This often leads to much safer and cleaner semantics than the promiscuously-sharable references used for class objects.

What's the size of a reference on the CLR

I was (purely out of curiosity) trying to find out what the size of an actual reference is when an allocation is made on the stack.
After reading this I still don't know (this answers it only for value types or type definitions), and I still cannot seem to find it anywhere.
So basically imagine a class as follows
class A
{
string a;
}
Now when an object of type A is instantiated, a reference to the string object would be stored on the stack, now what would the size of the allocation on the stack be?
Disclaimer: If I'm talking complete and utter nonsense please let me know :)
Just like the size of pointers, presumably, the size would be that of a native int: 32-bits on 32-bit platforms and 64-bits on a 64-bit platform.
It will be the size of IntPtr, either 32 or 64 bits, depending upon your environment.
Now when an object of type A is instantiated, a reference to the string object would be stored on the stack, now what would the size of the allocation on the stack be?
The string reference would in fact be stored on the heap, not the stack, since A is a reference type.

Resources