Direction of stack (higher memory address to lower memory address or from lower memory address to higher memory address) is dependent on machine architecture
Example Intel : higher memory address to lower memory address
SPARC : lower memory address to higher memory address
Is there any way by which we can change the direction of stack memory allocation using code.
Thanks.
In general, management of the stack is performed by the compiler (assuming we're talking about something like C or C++ here). However, the ISA may offer assistance, for instance push and pop instructions on x86.
There is no way to do this from C or C++, unless your compiler offers a non-portable language extension or a command-line option to control this (I can't see why it would, because changing this would make your program/library incompatible with all other programs/libraries!)
Stack is used on machine-instruction level. You cannot change processing unit behavior with code. The only thing one can do is to create program emulation level.
Some processors include explicit circuitry which pushes things onto a stack and pops them in various circumstances. Other processors do not include any such circuitry for a 'big' stack, but just provide a limited number of hardware registers or circuits that are used to store things like return addresses, and possibly a means by which software can copy the addresses stored in those registers or circuits to other parts of memory.
On processors whose hardware doesn't explicitly manipulate a stack in memory, one could use whatever pattern one wanted if one had control over all the code the processor would execute. Generally, however, there will be a pattern that the processor manufacturer recommends for implementing a stack, and code generated by compilers or by other people will most likely expect to use a stack implemented in that fashion.
Related
This question already has answers here:
Why do stacks typically grow downwards?
(11 answers)
Closed 5 years ago.
I have a stupid doubt about something related to memory. My doubt is: Why in memory, the higher addresses are considered in the "bottom", and the lowest addresses are considered in the "top"? Im going to explain in more detail:
The stack memory starts in high addresses and grows to lower addresses. So far this is what I understood, but why does the stack grow "up"? Why are the lower addresses located in the top of the memory?
I've seen various, contradictory memory structures: ones which consider the lower addresses at the bottom of the memory, and ones which consider the lower addresses at the top of the memory. Does it depend on the processor?
Thank you in advance.
It depends upon the instruction set architecture and the ABI. They both influence the organization (and growth direction) of the call stack.
Usually the call stack grows downwards, but there have been ISAs where that is not the case (see this). And some ISAs (e.g. IBM serie Z mainframes) don't have any hardware call stack (then, the call stack is just an ABI convention about register usage).
Most application software (e.g. your game, word processor, compiler, ...) are running above some operating system, in some process having some virtual address space (so in virtual memory).
Read some book on OSes, e.g. Operating Systems: Three Easy Pieces.
In practice (unless you code some operating system kernel managing virtual memory) you care mostly about the virtual address space (often made of numerous discontinuous segments). On Linux, use /proc/ (see proc(5)) to explore it (e.g. try cat /proc/$$/maps in your terminal). And notice that for a multi-threaded application each thread has its own call stack. Then "top" or "bottom" of the virtual address space don't really matter and don't have much sense.
If (as most people) you are writing (in any programming language other than the assembler) some application software above some OS, you don't care (as a developer) about real memory, but about virtual memory and resident set size. You don't care about the stack growth (it is managed by the OS, compiler, ISA, ... for the automatic variables of your code). You need to avoid stack overflow. It often happens that some pages (e.g. of your code segment[s]), perhaps those containing never used code, never get into RAM and stay paged out. And in practice most of the (virtual) memory of some process is not for its call stack: you usually allocate memory in the heap. My firefox browser (on my Linux desktop) has a virtual address space of 2.3 gigabytes (in more than a thousand of segments), but only 124 kilobytes of stack. Read about memory management. The call stack is often limited (e.g. to a few megabytes).
Ok I have a bit of a noob student question.
So I'm familiar with the fact that stacks contain subroutine calls, and heaps contain variable length data structures, and global static variables are assigned to permanant memory locations.
But how does it all work on a less theoretical level?
Does the compiler just assume it's got an entire memory region to itself from address 0 to address infinity? And then just start assigning stuff?
And where does it layout the instructions, stack, and heap? At the top of the memory region, end of memory region?
And how does this then work with virtual memory? The virtual memory is transparent to the program?
Sorry for a bajilion questions but I'm taking programming language structures and it keeps referring to these regions and I want to understand them on a more practical level.
THANKS much in advance!
A comprehensive explanation is probably beyond the scope of this forum. Entire texts are devoted to the subject. However, at a simplistic level you can look at it this way.
The compiler does not lay out the code in memory. It does assume it has the entire memory region to itself. The compiler generates object files where the symbols in the object files typically begin at offset 0.
The linker is responsible for pulling the object files together, linking symbols to their new offset location within the linked object and generating the executable file format.
The linker doesn't lay out code in memory either. It packages code and data into sections typically labeled .text for the executable code instructions and .data for things like global variables and string constants. (and there are other sections as well for different purposes) The linker may provide a hint to the operating system loader where to relocate symbols but the loader doesn't have to oblige.
It is the operating system loader that parses the executable file and decides where code and data are layed out in memory. The location of which depends entirely on the operating system. Typically the stack is located in a higher memory region than the program instructions and data and grows downward.
Each program is compiled/linked with the assumption it has the entire address space to itself. This is where virtual memory comes in. It is completely transparent to the program and managed entirely by the operating system.
Virtual memory typically ranges from address 0 and up to the max address supported by the platform (not infinity). This virtual address space is partitioned off by the operating system into kernel addressable space and user addressable space. Say on a hypothetical 32-bit OS, the addresses above 0x80000000 are reserved for the operating system and the addresses below are for use by the program. If the program tries to access memory above this partition it will be aborted.
The operating system may decide the stack starts at the highest addressable user memory and grows down with the program code located at a much lower address.
The location of the heap is typically managed by the run-time library against which you've built your program. It could live beginning with the next available address after your program code and data.
This is a wide open question with lots of topics.
Assuming the typical compiler -> assembler -> linker toolchain. The compiler doesnt know a whole lot, it simply encodes stack relative stuff, doesnt care how much or where the stack is, that is the purpose/beauty of a stack, dont care. The compiler generates assembler the assembler is assembled into an object, then the linker takes info linker script of some flavor or command line arguments that tell it the details of the memory space, when you
gcc hello.c -o hello
your installation of binutils has a default linker script which is tailored to your target (windows, mac, linux, whatever you are running on). And that script contains the info about where the program space starts, and then from there it knows where to start the heap (after the text, data and bss). The stack pointer is likely set either by that linker script and/or the os manages it some other way. And that defines your stack.
For an operating system with an mmu, which is what your windows and linux and mac and bsd laptop or desktop computers have, then yes each program is compiled assuming it has its own address space starting at 0x0000 that doesnt mean that the program is linked to start running at 0x0000, it depends on the operating system as to what that operating systems rules are, some start at 0x8000 for example.
For a desktop like application where it is somewhat a single linear address space from your programs perspective you will likely have .text first then either .data or .bss and then after all of that the heap will be aligned at some point after that. The stack however it is set is typically up high and works down but that can be processor and operating system specific. that stack is typically within the programs view of the world the top of its memory.
virtual memory is invisible to all of this the application normally doesnt know or care about virtual memory. if and when the application fetches an instruction or does a data tranfer it goes through hardware which is configured by the operating system and that converts between virtual and physical. If the mmu indicates a fault, meaning that space has not been mapped to a physical address, that can sometimes be intentional and then another use of the term "Virtual memory" applies. This second definition the operating system can then for example take some other chunk of memory, yours or someone elses, move that to hard disk for example, mark that other chunk as not being there, and then mark your chunk as having some ram then let you execute not knowing you were interrupted with some ram that you didnt know you had to take from someone else. Your application by design doesnt want to know any of this, it just wants to run, the operating system takes care of managing physical memory and the mmu that gives you a virtual (zero based) address space...
If you were to do a little bit of bare metal programming, without mmu stuff at first then later with, microcontroller, qemu, raspberry pi, beaglebone, etc you can get your hands dirty both with the compiler, linker script and configuring an mmu. I would use an arm or mips for this not x86, just to make your life easier, the overall big picture all translates directly across targets.
It depends.
If you're compiling a bootloader, which has to start from scratch, you can assume you've got the entire memory for yourself.
On the other hand, if you're compiling an application, you can assume you've got the entire memory for yourself.
The minor difference is that in the first case, you have all physical memory for yourself. As a bootloader, there's nothing else in RAM yet. In the second case, there's an OS in memory, but it will (normally) set up virtual memory for you so that it appears you have the entire address space for yourself. Usuaully you still have to ask the OS for actual memory, though.
The latter does mean that the OS imposes some rules. E.g. the OS very much would like to know where the first instruction of your program is. A simple rule might be that your program always starts at address 0, so the C compiler could put int main() there. The OS typically would like to know where the stack is, but this is already a more flexible rule. As far as "the heap" is concerned, the OS really couldn't care.
I have a few question about stack.
Is stack in CPU or RAM?
Is stack a place to run OPcode?
Is EIP in CPU or RAM?
Stack is always in RAM. There is a stack pointer that is kept in a register in CPU that points to the top of stack, i.e., the address of the location at the top of stack.
The stack is found within the RAM and not within the CPU. A segment is dedicated for the stack as seen in the following diagram:
From Wiki:
The stack area contains the program stack, a LIFO structure, typically
located in the higher parts of memory.
Which CPU are you talking about?
Some might contain memory that is used for callstacks, some contain memory that can be used for callstacks but require the OS to implement the callstack management code, and others contain no writable memory at all. For example, the x86 architecture tends to have one or more code caches and data caches built into the CPU.
Some CPUs or OSes implement operations that make specific areas of memory non-executable. To prevent stack-based buffer overflows, for example, many OSes use hardware and/or software-based data execution prevention, which might prevent stack memory from being executed as code. Some don't; It's entirely possible that an x86 CPU data cache line might be used to store both the callstack and code to be executed in faster memory.
EIP sounds like a register for the IA32 CPU architecture. If you're referring to IA-32, then yes, it's a CPU operation, though many OSes will switch it to/from RAM to emulate multi-tasking.
In modern architectures stack is mapped in ram.
Programming languages such ar C, C++, Pascal can allocate memory in ram, this is called Heap allocation, and other variables which live withing functions are stack allocated.
This dictated processors and operating systems to consider stack mapped within ram segment. And for processors with Memory Management Unit this can be anywhere in the ram. However, intel 8080 had a state bit indicating when it reads/writes from stack, thus stack could be implemented physically isolated from RAM. It is not known to me if such machine was implemented, but think of the situation, what memory does a C pointer points to, Heap or Stack.
Should Stack separation gain popularity we should have stack pointer and heap pointer in modern programming languages.
In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My doubt is:
If each process has its own address space, then it should be free to access any address in the same. So apart from permission restricted sections like that of .data, .bss, .text etc, one should be free to change value at any address. But this usually gives segmentation fault, why?
For acquiring the dynamic memory, we need to do a malloc. If the whole virtual space is made available to a process, then why can't it directly access it?
Different runs of a program results in different addresses for variables (both on stack and heap). Why is it so, when the environments for each run is same? Does it not affect the amount of addressable memory available for usage? (Does it have something to do with address space randomization?)
Some links on memory allocation (e.g. in heap).
The data available at different places is very confusing, as they talk about old and modern times, often not distinguishing between them. It would be helpful if someone could clarify the doubts while keeping modern systems in mind, say Linux.
Thanks.
Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
different memory regions serve different purposes.
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stack, so the OS might as well forbid that. Moreover, the tread stack can be de-allocated when the tread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches the kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set) or above 3<<30 (top two bits set) is kernel memory. You should not assume any unallocated memory region is in the user space.
if an application thinks about using a memory region but doesn't tell the OS, the OS may allocate something else to that memory (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyways), but it's easier to just call an OS function. It's just a good idea that the function call is "give me 10MB of heap", not "give me 10MB of heap starting at 0x12345678"
If the application allocates memory by using it then it typically does not de-allocate at all. This can be problematic as the OS still has to hold the unused pages (but the Java Virtual Machine does not de-allocate either, so hey).
Different runs of a program results in different addresses for variables
This is called memory layout randomisation and is used, alongside of proper permissions (stack space is not executable), to make buffer overflow attacks much more difficult. You can still kill the app, but not execute arbitrary code.
Some links on memory allocation (e.g. in heap).
Do you mean, what algorithm the allocator uses? The easiest algorithm is to always allocate at the soonest available position and link from each memory block to the next and store the flag if it's a free block or used block. More advanced algorithms always allocate blocks at the size of a power of two or a multiple of some fixed size to prevent memory fragmentation (lots of small free blocks) or link the blocks in a different structures to find a free block of sufficient size faster.
An even simpler approach is to never de-allocate and just point to the first (and only) free block and holds its size. If the remaining space is too small, throw it away and ask the OS for a new one.
There's nothing magical about memory allocators. All they do is to:
ask the OS for a large region and
partition it to smaller chunks
without
wasting too much space or
taking too long.
Anyways, the Wikipedia article about memory allocation is http://en.wikipedia.org/wiki/Memory_management .
One interesting algorithm is called "(binary) buddy blocks". It holds several pools of a power-of-two size and splits them recursively into smaller regions. Each region is then either fully allocated, fully free or split in two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.
I'm studying up on OS memory management, and I wish to verify that I got the basic mechanism of allocation \ virtual memory \ paging straight.
Let's say a process calls malloc(), what happens behind the scenes?
my answer: The runtime library finds an appropriately sized block of memory in its virtual memory address space.
(This is where allocation algorithms such as first-fit, best-fit that deal with fragmentation come into play)
Now let's say the process accesses that memory, how is that done?
my answer: The memory address, as seen by the process, is in fact virtual. The OS checks if that address is currently mapped to a physical memory address and if so performs the access. If it isn't mapped - a page fault is raised.
Am I getting this straight? i.e. the compiler\runtime library are in charge of allocating virtual memory blocks, and the OS is in charge of a mapping between processes' virtual address and physical addresses (and the paging algorithm that entails)?
Thanks!
About right. The memory needs to exist in the virtual memory of the process for a page fault to actually allocate a physical page though. You can't just start poking around anywhere and expect the kernel to put physical memory where you happen to access.
There is much more to it than this. Read up on mmap(), anonymous and not, shared and private. And brk() too. malloc() builds on brk() and mmap().
You've almost got it. The one thing you missed is how the process asks the system for more virtual memory in the first place. As Thomas pointed out, you can't just write where you want. There's no reason an OS couldn't be designed to allow that, but it's much more efficient if it has some idea where you're going to be writing and the space where you do it is contiguous.
On Unixy systems, userland processes have a region called the data segment, which is what it sounds like: it's where the data goes. When a process needs memory for data, it calls brk(), which asks the system to extend the data segment to a specified pointer value. (For example, if your existing data segment was empty and you wanted to extend it to 2M, you'd call brk(0x200000).)
Note that while very common, brk() is not a standard; in fact it was yanked out of POSIX.1 a decade ago because C specifies malloc() and there's no reason to mandate the interface for data segment allocation.