What factors affect the address of the stack on Linux?

ASLR is one. What about if it's disabled? Is the start address of the stack the same across distros, kernel versions, or what?
Does having different environment variables change the addresses of objects further up the stack?

I'm not about to go comparing many different kernel versions, but in the latest development kernel (4.0), the loader for ELF binaries (load_elf_binary in fs/binfmt_elf.c) initializes the user stack at an architecture-dependent address (defined by the STACK_TOP constant) if ASLR is disabled. On some architectures, STACK_TOP is a fixed address, but on many, it depends on the "personality" (or "execution domain") of the current process.
Yes, the environment variables and command-line arguments are pushed on the user stack before the loaded program is run, so they will affect the user stack pointer as seen by the process right after exec.
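To observe both effects, here is a minimal sketch of my own (assuming a Linux box with gcc and the setarch utility; the file and variable names are mine):

#include <stdio.h>
int main(void)
{
    int local;
    /* With ASLR off, the stack starts at the architecture's STACK_TOP,
       minus whatever argv and envp occupy above this frame. */
    printf("stack local at %p\n", (void *)&local);
    return 0;
}

Running it twice as setarch $(uname -m) -R ./a.out (randomization disabled) prints the same address; running env -i setarch $(uname -m) -R ./a.out shifts it by roughly the size of the dropped environment strings.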

Does an Operating System check every Instruction?

Not sure if anyone here can answer this.
I've learned that an operating system checks whether an instruction of a program changes something outside of its allocated memory, and if it does, the OS won't allow the program to proceed.
But if the OS has to check this for every instruction, won't this take up at least 5/6 of the CPU? That is the share I came up with when I tried to count the clock cycles needed to check every instruction.
If I've understood something wrong, please correct me, because I can't imagine that an OS takes up that much of the CPU.
There are several safeguards in place to ensure a non-privileged process behaves. I will discuss two of them in the context of the x86_64 architecture, but these concepts (mostly) extend to other major platforms.
Privilege Levels
There is a bit in a particular CPU register that indicates the current privilege level. These privilege levels are often called rings, where ring 0 corresponds to the kernel (i.e. highest privilege) and ring 3 corresponds to a userspace process (i.e. lowest privilege). There are other rings, but they're not relevant to this introduction.
Certain instructions in x86_64 may only be executed by privileged processes. The current ring must be 0 to execute a privileged instruction. If you try to execute such an instruction without the correct privileges, the processor raises a general protection fault. The kernel synchronously processes this interrupt and will almost certainly kill the userspace process.
The ring level can only be changed while in ring 0, so the userspace process can't simply change from ring 3 to ring 0 by itself.
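As a concrete illustration (a sketch of my own, assuming x86_64 Linux and GCC inline assembly): hlt may only run in ring 0, so executing it from userspace triggers exactly this path.

#include <stdio.h>
int main(void)
{
    puts("about to execute a privileged instruction in ring 3...");
    /* hlt requires ring 0; from userspace the CPU raises a general
       protection fault and the kernel kills us (SIGSEGV). */
    __asm__ volatile ("hlt");
    puts("never reached");
    return 0;
}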
Execute Permission in Page Tables
All instructions to be executed are stored in memory. Many architectures (including x86_64) use page tables to store mappings from virtual addresses to physical addresses. These page tables carry several bookkeeping entries as well, one of which is an execute permission bit. If this bit is not set for the page containing the instruction being fetched, the processor raises a page fault. As before, the kernel will synchronously process this interrupt and likely kill the offending process.
When are these execute bits set? They can be set dynamically via mmap(2), but in most cases the compiler emits executable sections (such as .text) in the binaries it generates, and when the OS loads the binary into memory, it sets the execute bit in the page table entries for the pages that back those sections.
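For instance, here is a hedged sketch of granting execute permission dynamically (assuming x86_64 Linux and 4 KiB pages; 0xc3 is the encoding of ret):

#include <sys/mman.h>
int main(void)
{
    /* One page, initially writable but NOT executable. */
    unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;
    page[0] = 0xc3;                          /* x86_64 "ret" */
    /* Flip the page to executable; without this, the call faults. */
    mprotect(page, 4096, PROT_READ | PROT_EXEC);
    ((void (*)(void))page)();                /* runs the single "ret" */
    return 0;
}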
Who's checking these bits?
You're right to ask about the performance penalty of an OS checking these bits for every single instruction. If the OS were doing this, it would be prohibitively expensive. Instead, the processor supports privilege levels and page tables (with the execute bit). The OS can set these bits, and rely on the processor to generate interrupts when a process acts outside its privileges.
These hardware checks are very fast.

How is ELF entry point address handled on systems without virtual memory?

On systems with an MMU and support for virtual memory, there can be multiple ELFs with the same entry point loaded at the same time, since their entry points are just virtual addresses that are translated to physical memory addresses at runtime. For example, on my amd64 machine the .text section is always mapped at address 0x00400000 and _start is always close to that address. But how does it work on systems without an MMU? Many of them probably don't support multitasking at all. Is it the developers' responsibility to pick ELF entry points by hand so that they don't overlap?
Short story: it is not handled at all.
And now there is a long story.
ELF is not executable as is; it's a sort of container for executable data. On systems with an OS and an MMU, the OS creates a process and the corresponding MMU page tables for it. It then reads the ELF file and copies the executable code and data sections into this newly allocated process memory according to the information in the ELF file (the BSS section is simply zero-filled). Once that is done, the program counter is set to the entry point address. Your program runs. Hooray.
An important point to highlight is that every process has its own virtual memory, which is mapped onto distinct physical memory. So virtual addresses can overlap or be identical across processes, while the physical addresses behind them are always different at any given moment.
Now consider a system without an MMU. There is no virtual memory, so every process has to be placed in a unique physical memory region and linked against that precise region.
In real life, small systems without an MMU simply don't use ELF files. Option 1: all apps are linked together into one big binary. Option 2: apps are linked separately at unique addresses, the executable content is extracted with a utility like objcopy, and those flat binaries are copied onto the system for execution. The 'OS' then keeps a list of entry points to start these 'processes'.
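For example (file names hypothetical), the entry point recorded in the ELF header can be inspected with readelf, and the flat binary extracted with objcopy:

readelf -h firmware.elf | grep Entry
objcopy -O binary firmware.elf firmware.bin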

Random memory address

I'm working on a virtual machine under Debian with EGLIBC 2.13 in order to learn memory address.
So I wrote a simple program that prints the address of a test variable, but every time I run it, I get a totally different address.
Here are two screenshots from two distinct executions (images not reproduced here):
What's causing this? The fact that I'm working in a VM, or my glibc version?
I guess it's glibc doing this to prevent buffer overflows, but I can't find the answer on the web.
And is it totally random?
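The asker's snippet isn't reproduced, but a minimal program along these lines (a reconstruction of mine) shows the same behaviour:

#include <stdio.h>
int main(void)
{
    int test = 42;
    /* With ASLR enabled, the stack (and therefore &test)
       moves on every execution. */
    printf("&test = %p\n", (void *)&test);
    return 0;
}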
First, GLib (from GTK) is not GNU libc (a.k.a. glibc).
Then, you are observing the effect of ASLR (address space layout randomization). Don't try to disable it on a server directly connected to the Internet, it is a valuable security measure.
ASLR is mostly provided by the Linux kernel (e.g. when handling mmap(2) without MAP_FIXED, as most implementations of malloc do, and probably also at execve(2) time for the initial stack). Changing your libc (e.g. to musl-libc) won't disable it.
You could disable system-wide ASLR on a laptop (or on a Linux system running inside some VM) using proc(5): run
echo 0 > /proc/sys/kernel/randomize_va_space
as root. Be careful, by doing that you are decreasing the security of your system.
I don't know what you would call totally random, but ASLR is random enough. IIRC (but I might be wrong), the middle 32 bits of the 64-bit address (assuming a 64-bit Linux system) are quite random, to the point of making the result of mmap (and hence of malloc, which uses it) practically unpredictable and non-reproducible.
BTW, to see ASLR in practice, try several times (with ASLR enabled) the following command
cat /proc/self/maps
this command displays a textual representation of the address space (in virtual memory) of the process running that cat command. You'll see different outputs when you run it several times!
For debugging memory leaks, use valgrind. With a recent compiler (GCC 4.9 or better, or a recent Clang/LLVM), the address sanitizer is also useful: compile with gcc -Wall -Wextra -g -fsanitize=address to get all the warnings (even the extra ones), debug information, and the sanitizer instrumentation.
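For instance (a deliberately buggy toy program of my own), built with gcc -Wall -Wextra -g -fsanitize=address, the address sanitizer aborts at the out-of-bounds write below and reports the faulting address and the allocation site:

#include <stdlib.h>
int main(void)
{
    char *p = malloc(8);
    p[8] = 'x';   /* one byte past the allocation: ASan reports this */
    free(p);
    return 0;
}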

When do memory addresses get assigned?

Consider the following CPU instruction which takes the memory at address 16777386 (decimal) and stores it in Register 1:
Move &0x010000AA, R1
Traditionally programs are translated to assembly (machine code) at compile time. (Let's ignore more complex modern systems like jitting).
However, if this address allocation is completed statically at compile time, how does the OS ensure that two processes do not use the same memory? (eg if you ran the same compiled program twice concurrently).
Question:
How, and when, does a program get its memory addresses assigned?
Virtual Memory:
I understand most (if not all) modern systems use a hardware Memory Management Unit to provide virtual memory, with the high-order bits of an address selecting the page. This would allow for memory protection if each process used different pages. However, if this is how memory protection is enforced, the original question still stands, only this time as: how are page numbers assigned?
EDIT:
CPU:
One possibility is that the CPU handles memory protection by requiring that the OS assign a process ID before memory-based instructions are executed. However, this is only speculation; it would require hardware support in the CPU architecture, and I'm not sure RISC ISAs would be designed to do that.
With virtual memory, each process has a separate address space, so 0x010000AA in one process will refer to a different value than in another process.
Address spaces are implemented with kernel-controlled page tables that the processor uses to translate virtual page addresses to physical ones. Two processes using the same page number is not an issue, since the processes have separate page tables and the physical memory mapped can be different.
Usually executable code and global variables will be mapped statically, the stack will be mapped at a random address (some exploits are more difficult that way), and dynamic allocation routines will use syscalls to map more pages.
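A small illustration of my own (assuming a non-PIE build, e.g. gcc -no-pie): the global below sits at the same virtual address in every instance, yet two concurrently running copies keep independent values, because their page tables point that address at different physical frames.

#include <stdio.h>
#include <unistd.h>
int counter;   /* statically mapped global */
int main(void)
{
    counter++;
    /* Same virtual address in every run of a non-PIE build; the
       physical page behind it differs for each process. */
    printf("pid %d: &counter = %p, value = %d\n",
           (int)getpid(), (void *)&counter, counter);
    return 0;
}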
(Ignoring the Unix fork.) The initial state of a process's memory is set up by the executable loader. The linker defines the initial memory state and the loader creates it. That state usually includes memory for static data, executable code, writeable data, and the stack.
In most systems a process can modify the address space by adding pages (possibly removing them as well).
[Ignoring system addresses] In virtual (logical) memory systems each process has an address space starting at zero (usually the first page is not mapped). The address space is divided into pages. The operating system maps (and remaps) logical pages to physical pages.
Address 0x010000AA in one process thus refers to a different physical memory address in each process.
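To see the initial memory state the linker defined and the loader will create, you can dump a binary's program headers (any installed binary works; /bin/ls is just an example):

readelf -l /bin/ls

The LOAD segments list the virtual addresses, sizes, and permissions the loader uses when building the process image.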

Write protect a stack page on AIX?

I have a program where argv[0] gets overwritten from time to time. This happens (only) on a production machine which I cannot access and where I cannot use a debugger. In order to find the origin of this corruption, I'd like to write-protect this stack page, so that any write access would be turned into a fault, and I could get the address of the culprit instruction.
The system is a 64-bit AIX 5.3. When I try to invoke mprotect on my stack page, I get an ENOMEM error. I'm using gcc to generate my program.
On a Linux system (x86 based) I can set a similar protection using mprotect without trouble.
Is there any way to achieve this on AIX? Or is this a hopeless attempt?
On AIX, mprotect() requires that requested pages be shared memory or memory mapped files only. On AIX 6.1 and later, you can extend this to the text region, shared libraries, etc, with the MPROTECT_TXT environment variable.
You can however use the -qstackprotect option on XLC 11/AIX 6.1TL4 and later. "Stack Smashing Protection" is designed to protect against exactly the situation you're describing.
On AIX 5.3, my only suggestion would be to look into building with a toolset like Parasoft's Insure++. It would locate errant writes to your stack at runtime. It's pretty much the best (and now only) tool in the business for AIX development. We use it in house and it's invaluable when you need it.
For the record, a workaround for this problem is to move processing over to a pthread thread. On AIX, pthread thread stacks live in the data segment which can be mprotected (as opposed to the primordial thread, which cannot be mprotected). This is the way the JVM (OpenJDK) on AIX implements stack guards.
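Here is a rough sketch of that workaround (my own, using POSIX interfaces; AIX may spell some flags differently, e.g. MAP_ANON, and 4 KiB pages are assumed): give the worker thread a caller-supplied stack and write-protect one page of it, so a stray write faults immediately.

#include <pthread.h>
#include <sys/mman.h>
#define STACK_SIZE (1024 * 1024)
static void *worker(void *arg)
{
    /* ... the processing that used to corrupt argv[0] ... */
    (void)arg;
    return NULL;
}
int main(void)
{
    /* Allocate the thread stack ourselves so we know where it lives. */
    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* Guard page at the low end: any write into it faults at once. */
    mprotect(stack, 4096, PROT_NONE);
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstack(&attr, stack, STACK_SIZE);
    pthread_t t;
    pthread_create(&t, &attr, worker, NULL);
    pthread_join(t, NULL);
    return 0;
}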
