Go memory layout compared to C++/C - memory

In Go, it seems there are no constructors, but it is suggested that you allocate an object of a struct type using a function, usually named by "New" + TypeName, for example
func NewRect(x, y, width, height float64) *Rect {
    return &Rect{x, y, width, height}
}
However, I am not sure about Go's memory layout. In C/C++, this kind of code returns a pointer to a local object: the variable is allocated on the stack, so it may contain garbage after the function returns. Do I have to worry about this in Go? There seems to be no standard that says which data is allocated on the stack and which on the heap.
In Java, there seems to be a specific rule that basic types such as int and float are allocated on the stack, while objects derived from Object are allocated on the heap. Is there a similar statement for Go?

The Composite Literal section mentions:
Taking the address of a composite literal (§Address operators) generates a unique pointer to an instance of the literal's value.
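That means the pointer returned by the New function is a valid one. Because the value outlives the call, the compiler allocates it on the heap rather than in the function's stack frame, as the FAQ quoted below explains.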
Calls:
In a function call, the function value and arguments are evaluated in the usual order.
After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution.
The return parameters of the function are passed by value back to the calling function when the function returns.
You can see more in this answer and this thread.
As mentioned in "Stack vs heap allocation of structs in Go, and how they relate to garbage collection":
It's worth noting that the words "stack" and "heap" do not appear anywhere in the language spec.
The blog post "Escape Analysis in Go" details what happens, mentioning the FAQ:
When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame.
However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors.
Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.
The blog post adds:
The code that does the “escape analysis” lives in src/cmd/gc/esc.c.
Conceptually, it tries to determine if a local variable escapes the current scope; the only two cases where this happens are when a variable’s address is returned, and when its address is assigned to a variable in an outer scope.
If a variable escapes, it has to be allocated on the heap; otherwise, it’s safe to put it on the stack.
Interestingly, this applies to new(T) allocations as well.
If they don’t escape, they’ll end up being allocated on the stack. Here’s an example to clarify matters:
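var intPointerGlobal *int = nil

func Foo() *int {
    anInt0 := 0
    anInt1 := new(int)
    _, _ = anInt0, anInt1 // silence "declared and not used"

    anInt2 := 42
    intPointerGlobal = &anInt2 // address stored in a package-level variable

    anInt3 := 5
    return &anInt3 // address returned to the caller
}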
Above, anInt0 and anInt1 do not escape, so they are allocated on the stack;
anInt2 and anInt3 escape, and are allocated on the heap.
See also "Five things that make Go fast":
Unlike C, which forces you to choose if a value will be stored on the heap, via malloc, or on the stack, by declaring it inside the scope of the function, Go implements an optimisation called escape analysis.
Go’s optimisations are always enabled by default.
You can see the compiler’s escape analysis and inlining decisions with the -gcflags=-m switch.
Because escape analysis is performed at compile time, not run time, stack allocation will always be faster than heap allocation, no matter how efficient your garbage collector is.
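To tie this back to the question, here is a minimal sketch of the NewRect pattern as a complete program. The field names of Rect and the helper area are assumptions made for illustration; they do not appear in the original question.

```go
package main

import "fmt"

// Rect mirrors the struct from the question; the field names are assumed.
type Rect struct {
	X, Y, Width, Height float64
}

// NewRect returns a pointer to a composite literal. The value must outlive
// the call, so the compiler is free to place it on the heap.
func NewRect(x, y, width, height float64) *Rect {
	return &Rect{x, y, width, height}
}

// area only reads through the pointer and does not retain it.
func area(r *Rect) float64 {
	return r.Width * r.Height
}

func main() {
	r := NewRect(0, 0, 3, 4)
	fmt.Println(area(r)) // prints 12
}
```

Building this with go build -gcflags=-m should print the compiler's escape and inlining decisions. The exact messages vary between Go versions, and if NewRect is inlined into main the compiler may decide that the Rect does not escape after all. Either way the pointer stays valid.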

Related

Delphi dynamic array efficiency

I am not a Delphi expert and I was reading online about dynamic arrays and static arrays. In this article I have found a chapter called "Dynamic v. Static Arrays" with a code snippet and below the author says:
[...] access to a dynamic array can be faster than a static array!
I have understood that dynamic arrays are located on the heap (they are implemented with references/pointers).
So far I know that the access time is better on dynamic arrays. But is that the same thing with the allocation? Like if I called SetLength(MyDynArray, 5) is that slower than creating a MyArray = array[0..4] of XXX?
So far I know that the access time is better on dynamic arrays.
That is not correct. The statement in that article is simply false.
But is that the same thing with the allocation? Like if I called SetLength(MyDynArray, 5) is that slower than creating a MyArray = array[0..4] of XXX?
A common fallacy is that static arrays are allocated on the stack. They could be global variables, and so allocated automatically when the module is loaded. They could be local variables and allocated on the stack. They could be dynamically allocated with calls to New or GetMem. Or they could be contained in a compound type (e.g. a record or a class) and so allocated in whatever way the owning object is allocated.
Having got that clear, let's consider a couple of common cases.
Local variable, static array type
As mentioned, static arrays declared as local variables are allocated on the stack. Allocation is automatic and essentially free. Think of the allocation as being performed by the compiler (when it generates code to reserve a stack frame). As such there is no runtime cost to the allocation. There may be a runtime cost to access because this might generate a page fault. That's all perfectly normal though, and if you want to use a small fixed size array as a local variable then there is no faster way to do it.
Member variable of a class, static array type
Again, as described above, the allocation is performed by the containing object. The static array is part of the space reserved for the object and when the object is instantiated sufficient memory is allocated on the heap. The cost for heap allocation does not typically depend significantly on the size of the block to be allocated. An exception to that statement might be really huge blocks but I'm assuming your array is relatively small in size, tens or hundreds of bytes. Armed with that knowledge we can see again that the cost for allocation is essentially zero, given that we are already allocating the memory for the containing object.
Local variable, dynamic array type
A dynamic array is represented by a pointer. So your local variable is a pointer allocated on the stack. The same argument applies as for any other local variable, for instance the local variable of static array type discussed above. The allocation is essentially free. Before you can do anything with this variable though, you need to allocate it with a call to SetLength. That incurs a heap allocation which is expensive. Likewise when you are done you have to deallocate.
Member variable of a class, dynamic array type
Again, allocation of the dynamic array pointer is free, but you must call SetLength to allocate. That's a heap allocation. There needs to be a deallocation too when the object is destroyed.
Conclusion
For small arrays, whose lengths are known at compile time, use of static arrays results in more efficient allocation and deallocation.
Note that I am only considering allocation here. If allocation is a relatively insignificant portion of the time spent working with the object then this performance characteristic may not matter. For instance, suppose the array is allocated at program startup, and then used repeatedly for the duration of the program. In such a scenario the access times dominate the allocation times and the difference between allocation times becomes insignificant.
On the flip side, imagine a short function that is called repeatedly during the program's lifetime, and suppose this function is the performance bottleneck. If it operates on a small array, then the allocation cost of using a dynamic array could be significant.
Very seldom can you draw hard and fast rules with performance. You need to understand how the tools work, and understand how your program uses these tools. You can then form opinions on which coding strategies might perform best, opinions that you should then test by profiling. You will be surprised more often than you might expect that your intuition is not a good predictor of performance.
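The same trade-off can be measured rather than guessed. The sketch below is in Go, not Delphi, and only mirrors the idea: a fixed-size array used as a local variable versus a run-time sized slice of the same length. The names sumFixed, sumDynamic and sink are made up for this illustration, and testing.AllocsPerRun is used here simply as a convenient way to count heap allocations.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the slice reachable so the compiler cannot keep it on the stack.
var sink []int

// sumFixed uses a fixed-size array as a local variable.
func sumFixed() int {
	var a [5]int
	for i := range a {
		a[i] = i
	}
	total := 0
	for _, v := range a {
		total += v
	}
	return total
}

// sumDynamic allocates a slice whose length is only known at run time.
func sumDynamic(n int) int {
	s := make([]int, n)
	for i := range s {
		s[i] = i
	}
	sink = s // publishing the slice forces it onto the heap
	total := 0
	for _, v := range s {
		total += v
	}
	return total
}

func main() {
	// Average number of heap allocations per call for each variant.
	fixed := testing.AllocsPerRun(100, func() { _ = sumFixed() })
	dynamic := testing.AllocsPerRun(100, func() { _ = sumDynamic(5) })
	fmt.Println("fixed:", fixed, "dynamic:", dynamic)
}
```

On a typical toolchain I would expect the fixed variant to report no allocations and the dynamic one to report one per call, but that depends on the compiler, which is exactly why measuring beats intuition.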

Delphi: What are the advantages of using System.New() instead of a local variable, other than just spare a tiny amount of memory?

Let's go back to the basics. Frankly, I have never used the New and Dispose functions before. However, after reading the New() documentation and the included examples on the Embarcadero Technologies website, as well as the Delphi Basics explanation of New(), I am left with a question:
What are the advantages of using System.New() instead of a local variable, other than just spare a tiny amount of memory?
Common code examples for New() are more or less as follows:
var
  pCustRec : ^TCustomer;
begin
  New(pCustRec);
  pCustRec^.Name := 'Her indoors';
  pCustRec^.Age := 55;
  Dispose(pCustRec);
end;
In what circumstances is the above code more appropriate than the code below?
var
  CustRec : TCustomer;
begin
  CustRec.Name := 'Her indoors';
  CustRec.Age := 55;
end;
If you can use a local variable, do so. That's a rule with practically no exceptions. This results in the cleanest and most efficient code.
If you need to allocate on the heap, use dynamic arrays, GetMem or New. Use New when allocating a record.
Examples of being unable to use the stack include structures whose size is not known at compile time, or very large structures. But for records, which are the primary use case for New, these concerns seldom apply.
So, if you are faced with a choice of stack vs heap for a record, invariably the stack is the correct choice.
From a different perspective:
Both can suffer from buffer overflow and can be exploited.
If a local variable overflows, you get stack corruption.
If a heap variable overflows, you get heap corruption.
Some say that stack corruptions are easier to exploit than heap corruptions, but that is not true in general.
Note that there are various mechanisms in operating systems, processor architectures, libraries and languages that try to help prevent these kinds of exploits.
For instance there is DEP (Data Execution Prevention), ASLR (Address Space Layout Randomization) and more are mentioned at Wikipedia.
A local static variable reserves space on the limited stack. Allocated memory is located on the heap, which is basically all memory available.
As mentioned, the stack space is limited, so you should avoid large local variables and also large parameters which are passed by value (absence of var/const in the parameter declaration).
A word on memory usage:
1. Simple types (integer, char, string, double etc.) are located directly on the stack. The amount of bytes used can be determined by the sizeof(variable) function.
2. The same applies to record variables and arrays.
3. Pointers and Objects require 4/8 bytes.
Every object (that is, class instances) is always allocated on the heap.
Value structures (simple numerical types, records containing only those types) can be allocated on the stack.
Dynamic arrays and strings content are always allocated on the heap. Only the reference pointer can be allocated on the stack. If you write:
function MyFunc;
var s: string;
...
Here, 4/8 bytes are allocated on the stack, but the string content (the text characters) will always be allocated on the heap.
So using New()/Dispose() is of poor benefit. If it contains no reference-counted types, you may use GetMem()/FreeMem() instead, since there is no internal pointer to set to zero.
The main drawback of New() and Dispose() is that, in case an exception occurs, you need to use a try...finally block:
var
  pCustRec : ^TCustomer;
begin
  New(pCustRec);
  try
    pCustRec^.Name := 'Her indoors';
    pCustRec^.Age := 55;
  finally
    Dispose(pCustRec);
  end;
end;
Whereas allocating on the stack lets the compiler do it for you, in a hidden manner:
var
  CustRec : TCustomer;
begin // here a try... is generated
  CustRec.Name := 'Her indoors';
  CustRec.Age := 55;
end; // here a finally + CustRec cleaning is generated
That's why I almost never use New()/Dispose(), but allocate on stack, or even better within a class.
The usual case for heap allocation is when the object must outlive the function that created it:
It is being returned as a function result or via a var/out parameter, either directly or by returning some container.
It's being stored in some object, struct or collection that is passed in or otherwise accessible inside the procedure (this includes being signaled/queued off to another thread).
In cases of limited stack space you might prefer allocation from the heap.

Why does my program crash after I call ReallocMemory?

I'm trying to modify the VirtualTreeView to see data in the tree nodes in the design mode.
The node memory is allocated in a private static method, so I can't change that. Instead, I'm trying to reallocate the memory afterwards to match the new size.
For the test purposes I'm trying to reallocate the same amount of memory:
ReallocMemory(Node, sizeof(Node^))
But the IDE hangs at a random iteration and throws a lot of access violations. My knowledge of memory allocation is limited, so I am probably forgetting something. Could you point me in the right direction?
ReallocMemory is a function. It returns the new pointer value; it does not modify its argument. You want to call ReallocMem instead, or else use the result of the function:
ReallocMem(Node, SizeOf(Node^));
or
Node := ReallocMemory(Node, SizeOf(Node^));
When either of those functions cannot resize the block of memory in-place, it allocates new memory, copies the old contents into the new buffer, and then frees the original buffer. If you ignore the ReallocMemory result, then you have discarded the new pointer and retained the old, stale pointer in the Node variable. Continued use of a stale pointer would explain access violations and other unpredictable behavior.
There are two versions of those functions for C++ compatibility. C++ doesn't have Delphi's "compiler magic," which is what allows the compiler to have a single ReallocMem function that accepts and modifies any pointer type.
The ReallocMemory function looks like the C++ realloc function, but they don't behave quite the same way, which is why it's safe to directly overwrite the input variable with the function's return value. When reallocation fails, the function throws an exception, just like ReallocMem, whereas realloc just returns a null pointer.
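The "keep the returned pointer" rule is not specific to Delphi. As a rough analogy in Go, here is a sketch of a resizing helper that may hand back a different block; grow is a hypothetical function written for this illustration, not part of any library.

```go
package main

import "fmt"

// grow returns a buffer of length n. Like a reallocating call, it may hand
// back a different block, so the caller must keep the returned value.
func grow(buf []byte, n int) []byte {
	if n <= cap(buf) {
		return buf[:n] // resized in place, same underlying block
	}
	bigger := make([]byte, n)
	copy(bigger, buf) // old contents are copied into the new block
	return bigger
}

func main() {
	node := []byte{1, 2, 3}
	old := node // stands in for a variable that was never updated

	node = grow(node, 64) // correct: overwrite the variable with the result
	node[0] = 9

	fmt.Println(old[0], node[0], len(node)) // prints 1 9 64
}
```

In Go the old block is kept alive by the garbage collector, so the stale value is merely out of date. In Delphi the original buffer has been freed, which is why using the stale pointer leads to access violations.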

why is stack and heap both required for memory allocation

I've searched for a while but found no conclusive answer on why value types have to be allocated on the stack while reference types, i.e. dynamically allocated objects, have to reside on the heap.
Why can't reference types be allocated on the stack as well?
They can be. In practice they're not, because the stack is typically a scarcer resource than the heap and allocating reference types on the stack may exhaust it quickly. Further, if a function returns data allocated on its stack, it will require copying semantics on the caller's part or risk returning something that will be overwritten by the next function call.
Value types, typically local variables, can be brought in and out of scope quickly and easily with native machine instructions. Copy semantics for value types on return is trivial as most fit into machine registers. This happens often and should be as cheap as possible.
It is not correct that value types always live on the stack. Read Jon Skeet's article on the topic:
Memory in .NET - what goes where
I understand that the stack paradigm (nested allocations/deallocations) cannot handle certain algorithms which need non-nested object lifetimes.
just as the static allocation paradigm cannot handle recursive procedure calls. (e.g. naive calculation of fibonacci(n) as f(n-1) + f(n-2))
I'm not aware of a simple algorithm that would illustrate this fact though. any suggestions would be appreciated :-)
Local variables are allocated on the stack. If that were not the case, there would be nowhere to keep the variables that point into the heap. You CAN allocate things on the stack if you want: just create a big enough buffer locally and manage it yourself.
Anything a method puts on the stack will vanish when the method exits. In .net and Java, it would be perfectly acceptable (in fact desirable) if a class object vanished as soon as the last reference to it vanished, but it would be fatal for an object to vanish while references to it still exist. It is not in the general case possible for the compiler to know, when a method creates an object, whether any references to that object will continue to exist after the method exits. Absent such assurance, the only safe way to allocate class objects is to store them on the heap.
Incidentally, in .net, one major advantage of mutable value types is that they can be passed by reference without surrendering perpetual control over them. If class 'foo', or a method thereof, has a structure 'boz' which one of foo's methods passes by reference to method 'bar', it is possible for bar, or the methods it calls, to do whatever they want to 'boz' until they return, but once 'bar' returns any references it held to 'boz' will be gone. This often leads to much safer and cleaner semantics than the promiscuously-sharable references used for class objects.
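For a small illustration of a lifetime that does not nest with the call stack, here is a sketch in Go (chosen only for brevity; the argument is the same in .NET or Java). The name newCounter is made up for this example.

```go
package main

import "fmt"

// newCounter returns a function that keeps using count after newCounter has
// returned, so count cannot live in newCounter's stack frame.
func newCounter() func() int {
	count := 0
	return func() int {
		count++
		return count
	}
}

func main() {
	a := newCounter()
	b := newCounter()
	a()
	a()
	// a's state was created first but may be dropped before or after b's:
	// the two lifetimes are independent of any call nesting.
	fmt.Println(a(), b()) // prints 3 1
}
```

Each counter's state has to survive the call that created it and can be discarded in any order, which a strictly nested allocate/deallocate discipline cannot express.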

Is it possible to get the size of the type that a pointer points to in Delphi 7?

I want to get the size of any "record" type in following function. But seems it doesn't work:
function GetDataSize(P : Pointer) : Integer;
begin
  Result := SizeOf(P^); // **How to write the code?**
end;
For example, the size of following record is 8 bytes
type
  SampleRecord = record
    Age1 : Integer;
    Age2 : Integer;
  end;
But GetDataSize(@a) always returns 1 (a is a variable of SampleRecord type, of course). What should I do?
I noticed that Delphi has a procedure New(var P: Pointer) which allocates a memory block that corresponds to the size of the type that P points to. How does it get the size?
The reason New knows how much memory to allocate is that New is compiler magic. It's a language built-in, so when the compiler sees you call it, it rewrites it to something like this:
// New(foo);
foo := System._New(SizeOf(foo^), TypeInfo(TypeOf(foo^)));
TypeOf here is a made-up Delphi function for expository purposes. The compiler knows the declared type of foo because it knows where all your variable declarations are. You can look at the implementation of _New in System.pas. Similar rewriting occurs for Dispose so it knows what kind of finalization to do before freeing the memory.
The ideas of variables and declarations are compile-time concepts. At run time, they cease to exist. At run time, a pointer is just an address. The type of what it points to was determined at compile time. Types are what determine something's size.
If you need to write a function that accepts pointers to multiple things with different sizes, then you'll just have to provide a second parameter that describes what the first one points to.
Check out another question here, "How to know what type is a var." The asker wondered how to determine more information about a variable given only its address.
You cannot find the size of a data structure through a variable of type Pointer, because the compiler cannot guess what it points to: a pointer can point to any data type you can think of. You can read some information here.
There's no safe way to determine the size of a record that a pointer points to. However, if you allocated the memory that the pointer points to, you can ask the size of that memory block. But then again, since you allocated that block, you should already know the size of that block!
The Delphi memory manager keeps track of every block of memory that gets allocated. With information from the memory manager it is possible to find this information, if your pointer points to the beginning of a memory block. However, if you allocated a large block of memory, loaded some data in it and your pointer points to some data inside this block, this method would be quite unreliable.
Also, if you use referenced types (dynamic arrays, strings, classes, etc.) in your record, the size it returns will still be unusable since you get the size of the reference (4 bytes) instead of the size of the data that is referenced to.
The New() procedure just uses the type information of the data type that you pass to it to get its size. To see exactly how it does this, you can check the Delphi source code: open \source\Win32\rtl\sys\System.pas and search for "_New" (with the underscore in front of it). Reading this source may help you understand how Delphi handles memory allocations, although it can be really complex.
Delphi has a built-in memory manager. I believe new has access to the heap object and uses HeapSize() (or similar routines) to get the size of a block, for some pointer.
