Better to allocate too much memory on the stack, or the correct amount on the heap? - memory

I have just started learning rust, and it is my first proper look into a low level language (I do python, usually). Part of the tutorial explains that a string literal is stored on the stack, because it has a fixed (and known) size; it also explains that a non-initialised string is stored on the heap, so that its size can grow as necessary.
My understanding is that the stack is much faster than the heap. In the case of a string whose size is unknown, but I know it will not ever require more than n bytes, does it make sense to allocate space on the stack for the maximum size, instead of sticking it on the heap?
The purpose of this question is not to solve a problem, but to help me understand, so I would appreciate verbose and detailed answers!

The difference in performance between the stack and the heap comes due to the fact that objects in the heap may change size at run time, and they must then be reallocated somewhere else in the heap.
Now for the verbose part. Imagine you have an integer i32. This number will always be the same size, so any modifications made to it will occur in place. When it goes out of scope (it stops being needed in the program) it will either be deleted, or, a more efficient solution, it will be deleted along with the whole stack it belongs to.
Now you want to create a String. So you create it in the heap and give it a value. And then you modify it and add some characters to it. Now two things can happen.
There is free memory after the string, so the allocator uses this memory to write the new part.
There already is an object allocated in the memory right after the string, and of course, you don't want to overwrite it. So the allocator looks for the next free memory space with enough size to hold the new string and copies into it that string. Then deletes the old one, freeing that memory.
As you can see in the heap the number of operations to be made is incredibly higher than in the stack, so its performance will be lower.
Now, in your case, there are some methods specifically for memory reservation. String::reserve() and String::reserve_exact(). I would recommend you to look at the documentation for Rust always. Usually there already is a std method for what you want.

Related

why stack pointer is initialized to the maximum value?

why stack pointer is initialized to the maximum value?
I only knows that It is the tiny register which stores the last program request’s address in a stack. It is the particular kind of buffer that stores the information in the order of top-down. but can anyone explain me why initially it's having max value.
Without more detail on the actual microprocessor, there isn't a single exact answer; but in general, each architecture handles stack pointer initialization a bit differently. For example, the version of ARM used in many microprocessors initializes the stack pointer (also R13) from the vector table (the first entry). Other architectures either initialize the register to 0 or some other value; so it's not always all 1s as you mention in your question. If the hardware itself doesn't initialize the stack pointer so somewhere meaningful, some of the first instructions usually does. And usually this value is near or at the top of memory as you mention, the stack typically grows down from higher addresses to lower ones; so a value of all 1s or some other larger value might make sense depending on how memory is laid out and managed.
One thing also worth mentioning is that you say the stack stores the "last program request's address" which if I understand correctly is one of the things the stack stores. In more architectures, the stack can store much more than just the return address of a call but also local variables, saved context when a call is made or a context is swapped (either by an OS or by an exception/interrupt) and anything else the program might want to push onto it.
So the short answer is: it isn't always set to the maximum value but it's usually set to some high value as the stack will grow down to lower addresses as things like data and addresses are pushed on it.

Does the order or syntax of allocate statement affect performance? (Fortran)

Because of having performance issues when passing a code from static to dynamic allocation, I started to wander about how memory allocation is managed in a Fortran code.
Specifically, in this question, I wander if the order or syntax used for the allocate statement makes any difference. That is, does it make any difference to allocate vectors like:
allocate(x(DIM),y(DIM))
versus
allocate(x(DIM))
allocate(y(DIM))
The syntax suggests that in the first case the program would allocate all the space for the vectors at once, possibly improving the performance, while in the second case it must allocate the space for one vector at a time, in such a way that they could end up far from each other. If not, that is, if the syntax does not make any difference, I wander if there is a way to control that allocation (for instance, allocating a vector for all space and using pointers to address the space allocated as multiple variables).
Finally, I notice now that I don't even know one thing: an allocate statement guarantees that at least a single vector occupies a contiguous space in memory (or the best it can?).
From the language standard point of view both ways how to write them are possible. The compiler is free to allocate the arrays where it wants. It normally calls malloc() to allocate some piece of memory and makes the allocatable arrays from that piece.
Whether it might allocate a single piece of memory for two different arrays in a single allocate statement is up to the compiler, but I haven't heard about any compiler doing that.
I just verified that my gfortran just calls __builtin_malloc two times in this case.
Another issue is already pointed out by High Performance Mark. Even when malloc() successfully returns, the actual memory pages might still not be assigned. On Linux that happens when you first access the array.
I don't think it is too important if those arrays are close to each other in memory or not anyway. The CPU can cache arrays from different regions of address space if it needs them.
Is there a way how to control the allocation? Yes, you can overload the malloc by your own allocator which does some clever things. It may be used to have always memory aligned to 32-bytes or similar purposes (example). Whether you will improve performance of your code by allocating things somehow close to each other is questionable, but you can have a try. (Of course this is completely compiler-dependent thing, a compiler doesn't have to use malloc() at all, but mostly they do.) Unfortunately, this will only works when the calls to malloc are not inlined.
There are (at least) two issues here, firstly the time taken to allocate the memory and secondly the locality of memory in the arrays and the impact of this on performance. I don't know much about the actual allocation process, although the links suggested by High Performance Mark and the answer by Vadimir F cover this.
From your question, it seems you are more interested in cache hits and memory locality given by arrays being next to each other. I would guess there is no guarantee either allocate statement ensures both arrays next to each other in memory. This is based on allocating arrays in a type, which in the fortran 2003 MAY 2004 WORKING DRAFT J3/04-007 standard
NOTE 4.20
Unless the structure includes a SEQUENCE statement, the use of this terminology in no way implies that these components are stored in this, or any other, order. Nor is there any requirement that contiguous storage be used.
From the discussion with Vadimir F, if you put allocatable arrays in a type and use the sequence keyword, e.g.
type botharrays
SEQUENCE
double precision, dimension(:), allocatable :: x, y
end type
this DOES NOT ensure they are allocated as adjacent in memory. For static arrays or lots of variables, a sequential type sounds like it may work like your idea of "allocating a vector for all space and using pointers to address the space allocated as multiple variables". I think common blocks (Fortran 77) allowed you to specify the relationship between memory location of arrays and variables in memory, but don't work with allocatable arrays either.
In short, I think this means you cannot ensure two allocated arrays are adjacent in memory. Even if you could, I don't see how this will result in a reduction in cache misses or improved performance. Even if you typically use the two together, unless the arrays are small enough that the cache will include multiple arrays in one read (assuming reads are allowed to go beyond array bounds) you won't benefit from the memory locality.

Reading from a stack and memory allocation at compile time

Objects can be put on and removed only from the top of a stack. But what about reading and writing their values? Please correct me if I'm wrong, but I think process must be able to read from any part of the stack, since if only reading from the top was possible it would have to remove (and store somewhere) whole content of the stack above a variable it wants to examine. But in that case, how does the process know where exactly in the stack is a particular variable? I suspect it just holds a pointer to it, but where is that pointer stored?
Another thing - reading about stacks I often find phrases like "All memory allocated on the stack is known at compile time." Well, I probably misunderstand this, so please tell me where's the flaw in my logic:
Suppose a local variable is created when an if() statement is true, and isn't when it's false. Whether it's true will turn out at run time. So at compile time there's no way to know if it should be created, hence I wouldn't think memory for it is allocated at all, as it would be wasteful. Consequently, it isn't created/known at compile time.
At compile time, it's known how much space each type needs: An Integer, for instance, is 4 Bytes wide on 32 bit platforms, and a class with 2 Integers consumes 8 Bytes. Whether this space is allocated for a specific variable is not necessarily known (may depend on an if, as you stated).
When you invoke a method, all parameters and the return address are pushed onto the stack. To get one parameter, you walk up the stack up to its position, which is computed by the base pointer and the size of each parameter.
So it is not entirely true for this stack that you can access the top element only. It is, however, for the Stack data structure.

Does FastMM support reserving virtual memory and calling in chunks to grow an array?

I know I can reserve virtual memory using VirtualAlloc.
e.g. I can claim 1GB of virtual memory and then call in the first MB of that to put my a growing array into.
When the array grows beyond 1MB I call in the 2nd MB and so on.
This way I don't need to move the array around in memory when it grows, it just stays in place and the Intel/AMD virtual memory manager takes care of my problems.
However does FastMM support this structure, so I don't have to do my own memory management?
Pseudo code:
type
PBigarray = ^TBigarray;
TBigArray = array[0..0] of SomeRecord;
....
begin
VirtualMem:= FastMM.ReserveVirtualMemory(1GB);
PBigArray:= FastMM.ClaimPhysicalMemory(VirtualMem, 1MB);
....
procedure GrowBigArray
begin
FastMM.ClaimMorePhysicalMemory(PBigArray, 1MB {extra});
//will generate OOM exception when claim exceeds 1GB
Does FastMM support this?
No, FastMM4 (as of the latest version I looked at) does not explicitly support this. It's really not a functionality you would expect in a general purpose memory manager as it's trivially simple to do with VirtualAlloc calls.
NexusMM4 (which is part of NexusDB) does something that gives you a similar result, but without wasting all the address space before it is needed in the background.
If you make an initial large allocation (directly via GetMem, or indirectly via a dynamic array or such) the memory is allocated in just the size needed, via VirtualAlloc.
But if that allocation is then resized to a larger size, NexusMM will use a different way to allocate memory which allows it to simply unmap the allocation from the address space an remap it again, at a larger size, when further reallocs takes place.
This prevents the 2 major problems that most general purpose memory managers have when reallocating:
during a normal realloc the existing and new allocation need to be present in the address space at the same time, temporarily doubling the address space and physical memory requirements
during a normal realloc, the whole contents of the existing allocation needs to be copied
So with NexusMM you would get all the advantages of what you showed in your pseudo code (with the exception that the first realloc will involve a copy, and that growing your array might change it's address) by simply using normal GetMem/ReallocMem/FreeMem calls.

What is the purpose of each of the memory locations, stack, heap, etc? (lost in technicalities)

Ok, I asked the difference between Stackoverflow and bufferoverflow yesterday and almost getting voted down to oblivion and no new information.
So it got me thinking and I decided to rephrase my question in the hopes that I get reply which actually solves my issue.
So here goes nothing.
I am aware of four memory segments(correct me if I am wrong). The code, data, stack and heap. Now AFAIK the the code segment stores the code, while the data segment stores the data related to the program. What seriously confuses me is the purpose of the stack and the heap!
From what I have understood, when you run a function, all the related data to the function gets stored in the stack and when you recursively call a function inside a function, inside of a function... While the function is waiting on the output of the previous function, the function and its necessary data don't pop out of the stack. So you end up with a stack overflow. (Again please correct me if I am wrong)
Also I know what the heap is for. As I have read someplace, its for dynamically allocating data when a program is executing. But this raises more questions that solves my problems. What happens when I initially initialize my variables in the code.. Are they in the code segment or in the data segment or in the heap? Where do arrays get stored? Is it that after my code executes all that was in my heap gets erased? All in all, please tell me about heap in a more simplified manner than just, its for malloc and alloc because I am not sure I completely understand what those terms are!
I hope people when answering don't get lost in the technicalities and can keep the terms simple for a layman to understand (even if the concept to be described is't laymanish) and keep educating us with the technical terms as we go along. I also hope this is not too big a question, because I seriously think they could not be asked separately!
What is the stack for?
Every program is made up of functions / subroutines / whatever your language of choice calls them. Almost always, those functions have some local state. Even in a simple for loop, you need somewhere to keep track of the loop counter, right? That has to be stored in memory somewhere.
The thing about functions is that the other thing they almost always do is call other functions. Those other functions have their own local state - their local variables. You don't want your local variables to interfere with the locals in your caller. The other thing that has to happen is, when FunctionA calls FunctionB and then has to do something else, you want the local variables in FunctionA to still be there, and have their same values, when FunctionB is done.
Keeping track of these local variables is what the stack is for. Each function call is done by setting up what's called a stack frame. The stack frame typically includes the return address of the caller (for when the function is finished), the values for any method parameters, and storage for any local variables.
When a second function is called, then a new stack frame is created, pushed onto the top of the stack, and the call happens. The new function can happily work away on its stack frame. When that second function returns, its stack frame is popped (removed from the stack) and the caller's frame is back in place just like it was before.
So that's the stack. So what's the heap? It's got a similar use - a place to store data. However, there's often a need for data that lives longer than a single stack frame. It can't go on the stack, because when the function call returns, it's stack frame is cleaned up and boom - there goes your data. So you put it on the heap instead. The heap is a basically unstructured chunk of memory. You ask for x number of bytes, and you get it, and can then party on it. In C / C++, heap memory stays allocated until you explicitly deallocate. In garbage collected languages (Java/C#/Python/etc.) heap memory will be freed when the objects on it aren't used anymore.
To tackle your specific questions from above:
What's the different between a stack overflow and a buffer overflow?
They're both cases of running over a memory limit. A stack overflow is specific to the stack; you've written your code (recursion is a common, but not the only, cause) so that it has too many nested function calls, or you're storing a lot of large stuff on the stack, and it runs out of room. Most OS's put a limit on the maximum size the stack can reach, and when you hit that limit you get the stack overflow. Modern hardware can detect stack overflows and it's usually doom for your process.
A buffer overflow is a little different. So first question - what's a buffer? Well, it's a bounded chunk of memory. That memory could be on the heap, or it could be on the stack. But the important thing is you have X bytes that you know you have access to. You then write some code that writes X + more bytes into that space. The compiler has probably already used the space beyond your buffer for other things, and by writing too much, you've overwritten those other things. Buffer overruns are often not seen immediately, as you don't notice them until you try to do something with the other memory that's been trashed.
Also, remember how I mentioned that return addresses are stored on the stack too? This is the source of many security issues due to buffer overruns. You have code that uses a buffer on the stack and has an overflow vulnerability. A clever hacker can structure the data that overflows the buffer to overwrite that return address, to point to code in the buffer itself, and that's how they get code to execute. It's nasty.
What happens when I initially initialize my variables in the code.. Are they in the code segment or in the data segment or in the heap?
I'm going to talk from a C / C++ perspective here. Assuming you've got a variable declaration:
int i;
That reserves (typically) four bytes on the stack. If instead you have:
char *buffer = malloc(100);
That actually reserves two chunks of memory. The call to malloc allocates 100 bytes on the heap. But you also need storage for the pointer, buffer. That storage is, again, on the stack, and on a 32-bit machine will be 4 bytes (64-bit machine will use 8 bytes).
Where do arrays get stored...???
It depends on how you declare them. If you do a simple array:
char str[128];
for example, that'll reserve 128 bytes on the stack. C never hits the heap unless you explicitly ask it to by calling an allocation method like malloc.
If instead you declare a pointer (like buffer above) the storage for the pointer is on the stack, the actual data for the array is on the heap.
Is it that after my code executes all that was in my heap gets erased...???
Basically, yes. The OS will clean up the memory used by a process after it exits. The heap is a chunk of memory in your process, so the OS will clean it up. Although it depends on what you mean by "clean it up." The OS marks those chunks of RAM as now free, and will reuse it later. If you had explicit cleanup code (like C++ destructors) you'll need to make sure those get called, the OS won't call them for you.
All in all, please tell me about heap in a more simplified manner than just, its for malloc and alloc?
The heap is, much like it's name, a bunch of free bytes that you can grab a piece at a time, do whatever you want with, then throw back to use for something else. You grab a chunk of bytes by calling malloc, and you throw it back by calling free.
Why would you do this? Well, there's a couple of common reasons:
You don't know how many of a thing
you need until run time (based on
user input, for example). So you
dynamically allocate on the heap as
you need them.
You need large data structures. On
Windows, for example, a thread's
stack is limited by default to 1
meg. If you're working with large
bitmaps, for example, that'll be a
fast way to blow your stack and get
a stack overflow. So you grab that
space of the heap, which is usually
much, much larger than the stack.
The code, data, stack and heap?
Not really a question, but I wanted to clarify. The "code" segment contains the executable bytes for your application. Typically code segments are read only in memory to help prevent tampering. The data segment contains constants that are compiled into the code - things like strings in your code or array initializers need to be stored somewhere, the data segment is where they go. Again, the data segment is typically read only.
The stack is a writable section of memory, and usually has a limited size. The OS will initialize the stack and the C startup code calls your main() function for you. The heap is also a writable section of memory. It's reserved by the OS, and functions like malloc and free manage getting chunks out of it and putting them back.
So, that's the overview. I hope this helps.
With respect to stack... This is precicely where the parameters and local variables of the functions / procedures are stored. To be more precise, the params and local variables of the currently executing function is only accessible from the stack... Other variables that belong to chain of functions that were executed before it will be in stack but will not be accessible until the current function completed its operations.
With respect global variables, I believe these are stored in data segment and is always accessible from any function within the created program.
With respect to Heap... These are additional memories that can be made allotted to your program whenever you need them (malloc or new)... You need to know where the allocated memory is in heap (address / pointer) so that you can access it when you need. Incase you loose the address, the memory becomes in-accessible, but the data still remains there. Depending on the platform and language this has to be either manually freed by your program (or a memory leak occurs) or needs to be garbage collected. Heap is comparitively huge to stack and hence can be used to store large volumes of data (like files, streams etc)... Thats why Objects / Files are created in Heap and a pointer to the object / file is stored in stack.
In terms of C/C++ programs, the data segment stores static (global) variables, the stack stores local variables, and the heap stores dynamically allocated variables (anything you malloc or new to get a pointer to). The code segment only stores the machine code (the part of your program that gets executed by the CPU).

Resources