memory management and segmentation faults in modern day systems (Linux) - memory

In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My doubt is:
If each process has its own address space, then it should be free to access any address in the same. So apart from permission restricted sections like that of .data, .bss, .text etc, one should be free to change value at any address. But this usually gives segmentation fault, why?
For acquiring the dynamic memory, we need to do a malloc. If the whole virtual space is made available to a process, then why can't it directly access it?
Different runs of a program result in different addresses for variables (both on stack and heap). Why is that, when the environment for each run is the same? Does it not affect the amount of addressable memory available for usage? (Does it have something to do with address space randomization?)
Some links on memory allocation (e.g. in heap).
The data available at different places is very confusing, as they talk about old and modern times, often not distinguishing between them. It would be helpful if someone could clarify the doubts while keeping modern systems in mind, say Linux.
Thanks.

Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
different memory regions serve different purposes.
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stack, so the OS might as well forbid that. Moreover, the thread stack can be de-allocated when the thread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches the kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set) or above 3<<30 (top two bits set) is kernel memory. You should not assume any unallocated memory region is in the user space.
if an application thinks about using a memory region but doesn't tell the OS, the OS may allocate something else to that memory (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyways), but it's easier to just call an OS function. It's just a good idea that the function call is "give me 10MB of heap", not "give me 10MB of heap starting at 0x12345678"
If the application allocates memory by using it then it typically does not de-allocate at all. This can be problematic as the OS still has to hold the unused pages (but the Java Virtual Machine does not de-allocate either, so hey).
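On Linux, the regions listed above and their permissions can be inspected through /proc/self/maps. The following is a minimal sketch, assuming the usual "address perms offset dev inode pathname" line format; the helper name parse_maps_line is made up for illustration.

```python
import os

def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    # Fields: address range, permissions, offset, device, inode, optional path.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    perms = fields[1]
    name = fields[5].strip() if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), perms, name

# Print this process's own regions when running on Linux.
if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as maps:
        for line in maps:
            start, end, perms, name = parse_maps_line(line)
            print(f"{start:#014x}-{end:#014x} {perms} {name or '[anonymous]'}")
```

Code typically shows up as r-xp, read-only data as r--p, and the heap and stack as rw-p. Any address that falls outside every listed range is unmapped, and touching it gives a segmentation fault.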
Different runs of a program result in different addresses for variables
This is called address space layout randomisation (ASLR) and is used, alongside proper permissions (stack space is not executable), to make buffer overflow attacks much more difficult. An attacker can still crash the app, but it is much harder to execute arbitrary code.
Some links on memory allocation (e.g. in heap).
Do you mean, what algorithm the allocator uses? The easiest algorithm is to always allocate at the soonest available position and link from each memory block to the next and store the flag if it's a free block or used block. More advanced algorithms always allocate blocks at the size of a power of two or a multiple of some fixed size to prevent memory fragmentation (lots of small free blocks) or link the blocks in a different structures to find a free block of sufficient size faster.
An even simpler approach is to never de-allocate: just keep a pointer to the first (and only) free block and hold its size. If the remaining space is too small, throw it away and ask the OS for a new one.
There's nothing magical about memory allocators. All they do is to:
ask the OS for a large region and
partition it to smaller chunks
without
wasting too much space or
taking too long.
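As an illustration of the "easiest algorithm" described above (allocate at the first available position, keep a free/used flag per block), here is a small simulation. It is only a sketch: it hands out offsets into an imaginary region instead of real memory, and the class name FirstFitHeap is invented for this example.

```python
class FirstFitHeap:
    """Toy first-fit allocator over an imaginary region of `size` bytes."""

    def __init__(self, size):
        # Each block is [offset, size, is_free], kept sorted by offset.
        self.blocks = [[0, size, True]]

    def alloc(self, size):
        """Return the offset of the first free block that fits, or None."""
        for i, (offset, block_size, is_free) in enumerate(self.blocks):
            if is_free and block_size >= size:
                self.blocks[i] = [offset, size, False]
                if block_size > size:
                    # Split: the remainder stays free right after the new block.
                    self.blocks.insert(i + 1, [offset + size, block_size - size, True])
                return offset
        return None

    def free(self, offset):
        """Mark the block at `offset` free and merge adjacent free blocks."""
        for block in self.blocks:
            if block[0] == offset and not block[2]:
                block[2] = True
                break
        else:
            raise ValueError("not an allocated block")
        merged = []
        for block in self.blocks:
            if merged and merged[-1][2] and block[2]:
                merged[-1][1] += block[1]  # coalesce with the previous free block
            else:
                merged.append(block)
        self.blocks = merged
```

Allocating 10, 20 and 30 bytes, freeing the middle block and then asking for 5 bytes reuses the hole at offset 10. The leftover 15-byte gap is the kind of fragmentation the more advanced schemes try to limit.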
Anyways, the Wikipedia article about memory allocation is http://en.wikipedia.org/wiki/Memory_management .
One interesting algorithm is called "(binary) buddy blocks". It holds several pools of a power-of-two size and splits them recursively into smaller regions. Each region is then either fully allocated, fully free or split in two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.
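The arithmetic behind buddy blocks is compact. The sketch below assumes offsets are measured from the start of a pool and that every block is aligned to its own size; under that assumption a block's buddy is found by flipping one bit of its offset. The function names and the 16-byte minimum block size are made up for illustration.

```python
def block_size_for(request, min_block=16):
    """Round a request up to the next power-of-two block size."""
    size = min_block
    while size < request:
        size *= 2
    return size

def buddy_of(offset, size):
    """Offset of the buddy of the block at `offset` with the given size."""
    # Blocks are size-aligned, so the two halves of a split differ in one bit.
    return offset ^ size

def split_down(offset, size, target):
    """Split a free block until it has size `target`.

    Returns the block that is kept and the buddies that become free.
    """
    freed = []
    while size > target:
        size //= 2
        freed.append((offset + size, size))  # upper half is released
    return (offset, size), freed
```

For example, serving a 64-byte request from a free 256-byte block at offset 0 keeps (0, 64) and frees the buddies (128, 128) and (64, 64). When a block is released, checking whether buddy_of(offset, size) is also free tells the allocator whether the two can be merged again.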

Related

Operating Systems: Processes, Pagination and Memory Allocation doubts

I have several doubts about processes and memory management. I list the main ones below. I'm slowly trying to solve them by myself but I would still like some help from you experts =).
I understood that the data structures associated with a process are more or less these:
text, data, stack, kernel stack, heap, PCB.
If the process is created but the LTS decides to send it to secondary memory, are all the data structures copied for example on SSD or maybe just text and data (and PCB in kernel space)?
Pagination allows you to allocate processes in a non-contiguous way:
How does the kernel know if the process is trying to access an illegal memory area? After not finding the index on the page table, does the kernel realize that it is not even in virtual memory (secondary memory)? If so, is an interrupt (or exception) thrown? Is it handled immediately or later (maybe there was a process switch)?
If the processes are allocated non-contiguously, how does the kernel realize that there has been a stack overflow since the stack typically grows down and the heap up? Perhaps the kernel uses virtual addresses in PCBs as memory pointers that are contiguous for each process so at each function call it checks if the VIRTUAL pointer to the top of the stack has touched the heap?
How do programs generate their internal addresses? For example, in the case of virtual memory, everyone assumes starting from the address 0x0000 ... up to the address 0xffffff ... and is it then up to the kernel to proceed with the mapping?
How do processes end? Is the system call exit called both in case of normal termination (finished last instruction) and in case of killing (by the parent process, kernel, etc.)? Does the process itself enter kernel mode and free up its associated memory?
Kernel schedulers (LTS, MTS, STS) when are they invoked? From what I understand there are three types of kernels:
separate kernel, below all processes.
the kernel runs inside the processes (they only change modes) but there are "process switching functions".
the kernel itself is based on processes but still everything is based on process switching functions.
I guess the number of pages allocated the text and data depend on the "length" of the code and the "global" data. On the other hand, is the number of pages allocated per heap and stack variable for each process? For example I remember that the JVM allows you to change the size of the stack.
When a running process wants to write n bytes in memory, does the kernel try to fill a page already dedicated to it and a new one is created for the remaining bytes (so the page table is lengthened)?
I really thank those who will help me.
Have a good day!
I think you have lots of misconceptions. Let's try to clear some of these.
If the process is created but the LTS decides to send it to secondary memory, are all the data structures copied for example on SSD or maybe just text and data (and PCB in kernel space)?
I don't know what you mean by LTS. The kernel can decide to send some pages to secondary memory but only on a page granularity. Meaning that it won't send a whole text segment nor a complete data segment but only a page or some pages to the hard-disk. Yes, the PCB is stored in kernel space and never swapped out (see here: Do Kernel pages get swapped out?).
How does the kernel know if the process is trying to access an illegal memory area? After not finding the index on the page table, does the kernel realize that it is not even in virtual memory (secondary memory)? If so, is an interrupt (or exception) thrown? Is it handled immediately or later (maybe there was a process switch)?
On x86-64, each page table entry has 12 bits reserved for flags. The first (right-most) bit is the present bit. On access to the page referenced by this entry, it tells the processor whether it should raise a page fault. If the present bit is 0, the processor raises a page fault and calls a handler defined by the OS in the IDT (interrupt 14). Virtual memory is not secondary memory. It is not the same. Virtual memory doesn't have a physical medium to back it. It is a concept that is, yes, implemented in hardware, but with logic, not with a physical medium. The kernel holds a memory map of the process in the PCB. On a page fault, if the access was not within this memory map, it will kill the process (on Linux, by sending it SIGSEGV).
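To make the flag bits concrete, here is a sketch that decodes a 64-bit page table entry. Only the present bit is described above; the other bit positions (writable, user, accessed, dirty, no-execute) and the 4 KiB frame mask are assumptions taken from the standard x86-64 entry layout, and the function name decode_pte is invented.

```python
def decode_pte(entry):
    """Decode the main fields of an x86-64 page table entry (4 KiB page)."""
    return {
        "present":  bool(entry & (1 << 0)),   # 0 means any access page-faults
        "writable": bool(entry & (1 << 1)),
        "user":     bool(entry & (1 << 2)),   # accessible from user mode
        "accessed": bool(entry & (1 << 5)),
        "dirty":    bool(entry & (1 << 6)),
        "no_exec":  bool(entry & (1 << 63)),
        # Bits 12..51 hold the physical frame address.
        "frame":    entry & 0x000FFFFFFFFFF000,
    }

example = decode_pte(0x8000000012345067)
print(example)
```

An entry whose present bit is clear says nothing about where the data lives. The kernel decides, using its own memory map, whether the fault means "load this page" or "kill the process".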
If the processes are allocated non-contiguously, how does the kernel realize that there has been a stack overflow since the stack typically grows down and the heap up? Perhaps the kernel uses virtual addresses in PCBs as memory pointers that are contiguous for each process so at each function call it checks if the VIRTUAL pointer to the top of the stack has touched the heap?
The processes are allocated contiguously in virtual memory but not in physical memory. See my answer here for more info: Each program allocates a fixed stack size? Who defines the amount of stack memory for each application running?. I think stack overflow is caught with a guard page. The stack has a maximum size (8MB by default on Linux) and one page marked not present is left underneath it, so that if this page is accessed, the kernel is notified via a page fault that it should kill the process. A stack overflow in one process cannot reach another process's memory, because the paging mechanism already isolates different processes via the page tables. The heap has a portion of virtual memory reserved and it is very big. The heap can thus grow according to how much physical space you actually have to back it. That is the size of the swap file + RAM.
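The maximum stack size mentioned above can be queried from user space. This sketch uses Python's resource module (Unix only), which wraps getrlimit; the helper format_limit is made up for display purposes.

```python
import resource

def format_limit(limit):
    """Render a resource limit in MiB, or 'unlimited'."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // (1024 * 1024)} MiB"

# Soft limit: what the process gets now. Hard limit: ceiling for raising it.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft stack limit:", format_limit(soft))
print("hard stack limit:", format_limit(hard))
```

On many Linux systems the soft limit prints as 8 MiB, which is the same value `ulimit -s` reports in kilobytes.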
How do programs generate their internal addresses? For example, in the case of virtual memory, everyone assumes starting from the address 0x0000 ... up to the address 0xffffff ... and is it then up to the kernel to proceed with the mapping?
Traditionally, programs are linked to assume a fixed address (often 0x400000 on x86-64 Linux) for the base of the executable. With ASLR, the executable is built as position-independent code and its base address is chosen when the executable is loaded. Most current Linux distributions build executables this way (as PIE) by default.
How do processes end? Is the system call exit called both in case of normal termination (finished last instruction) and in case of killing (by the parent process, kernel, etc.)? Does the process itself enter kernel mode and free up its associated memory?
The kernel has a memory map for each process. When the process dies, whether by normal exit or by abnormal termination, the kernel walks that memory map and releases the memory the process was using.
Kernel schedulers (LTS, MTS, STS) when are they invoked?
Not quite. The kernel isn't a process; it runs in the context of whichever process entered it (through a system call, an exception, or an interrupt). There can also be kernel threads. The kernel programs a timer at boot and, when there is a timer interrupt, the kernel can call the scheduler to preempt the running process. The scheduler also runs when a process blocks (for example, while waiting for I/O) or gives up the CPU voluntarily.
I guess the number of pages allocated the text and data depend on the "length" of the code and the "global" data. On the other hand, is the number of pages allocated per heap and stack variable for each process? For example I remember that the JVM allows you to change the size of the stack.
The heap and stack have portions of virtual memory reserved for them. The text/data segment start at 0x400000 and end wherever they need. The space reserved for them is really big in virtual memory. They are thus limited by the amount of physical memory available to back them. The JVM is another thing. The stack in JVM is not the real stack. The stack in JVM is probably heap because JVM allocates heap for all the program's needs.
When a running process wants to write n bytes in memory, does the kernel try to fill a page already dedicated to it and a new one is created for the remaining bytes (so the page table is lengthened)?
The kernel doesn't do that. On Linux, the libstdc++/libc C++/C implementation does that instead. When you allocate memory dynamically, the C++/C implementation keeps track of the allocated space so that it won't request a new page for a small allocation.
EDIT
Do compiled (and interpreted?) Programs only work with virtual addresses?
Yes they do. Everything is a virtual address once paging is enabled. Enabling paging is done via a control register set at boot by the kernel. The MMU of the processor will automatically read the page tables (among which some are cached) and will translate these virtual addresses to physical ones.
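As a rough picture of what the MMU does with a virtual address, the sketch below splits one into the indices used to walk the page tables. It assumes x86-64 4-level paging with 4 KiB pages (9 index bits per level, 12 offset bits); the function name split_vaddr is invented.

```python
def split_vaddr(vaddr):
    """Split a 48-bit x86-64 virtual address into page-table indices."""
    return {
        "pml4":   (vaddr >> 39) & 0x1FF,  # index into the top-level table
        "pdpt":   (vaddr >> 30) & 0x1FF,
        "pd":     (vaddr >> 21) & 0x1FF,
        "pt":     (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,          # byte within the 4 KiB page
    }

print(split_vaddr(0x400000))        # a typical non-PIE executable base
print(split_vaddr(0x7FFFFFFFE000))  # an address near the top of user space
```

Each index selects one entry at its level; the entry found at the last level supplies the physical frame, and the offset is added unchanged.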
So do pointers inside PCBs also use virtual addresses?
Yes. For example, the PCB on Linux is the task_struct. Its mm field points to an mm_struct, which holds a field called pgd of type pgd_t *. That field holds a virtual address and, when dereferenced, it yields the first entry of the PML4 on x86-64.
And since the virtual memory of each process is contiguous, the kernel can immediately recognize stack overflows.
The kernel doesn't check for stack overflows on each function call. It simply won't allocate more pages to the stack than the maximum stack size, which on Linux is a resource limit (RLIMIT_STACK). The stack is used with push and pop instructions. A single push cannot write more than 8 bytes, so it is simply a matter of reserving a guard page below the stack so that an access to it causes a page fault.
however the scheduler is invoked from what I understand (at least in modern systems) with timer mechanisms (like round robin). Is that correct?
Round-robin is not a timer mechanism. The timer is interacted with using memory mapped registers. These registers are detected using the ACPI tables at boot (see my answer here: https://cs.stackexchange.com/questions/141870/when-are-a-controllers-registers-loaded-and-ready-to-inform-an-i-o-operation/141918#141918). It works similarly to the answer I provided for USB (on the link I provided here). Round-robin is a scheduler priority scheme often called naive because it simply gives every process a time slice and executes them in order which is not currently used in the Linux kernel (I think).
I did not understand the last point. How is the allocation of new memory managed.
The allocation of new memory is done with a system call. See my answer here for more info: Who sets the RIP register when you call the clone syscall?.
The user mode process jumps into a handler for the system call by executing the syscall instruction. The processor jumps to an address that the kernel stored at boot in the IA32_LSTAR model-specific register. Then the kernel jumps from that assembly entry point to a function. This function does what the user mode process requested and returns to the user mode process. This is often not done by the programmer but by the C++/C implementation (often called the standard library), which is a user mode library that is linked against dynamically.
The C++/C standard library will keep track of the memory it allocated by, itself, allocating some memory and by keeping records. Then, if you ask for a small allocation, it will use the pages it already allocated instead of requesting new ones using mmap (on Linux).
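The bookkeeping described above can be sketched in a few lines. This is not how any real malloc is implemented; it only shows the idea of asking the OS for whole pages (here through Python's mmap module with an anonymous mapping) and serving small requests from pages already obtained. The class name PageArena and the chunk size are assumptions for the example.

```python
import mmap

class PageArena:
    """Serve small allocations out of page-sized chunks obtained from the OS."""

    def __init__(self, chunk_pages=16):
        self.chunk_size = chunk_pages * mmap.PAGESIZE
        self.chunks = []  # anonymous mappings obtained from the OS
        self.used = 0     # bytes already handed out from the newest chunk

    def alloc(self, size):
        if size > self.chunk_size:
            raise ValueError("request larger than one chunk")
        if not self.chunks or self.used + size > self.chunk_size:
            # Only now is the OS involved: map a fresh anonymous region.
            self.chunks.append(mmap.mmap(-1, self.chunk_size))
            self.used = 0
        start = self.used
        self.used += size
        # A writable view onto the caller's slice of the chunk.
        return memoryview(self.chunks[-1])[start:start + size]

arena = PageArena()
first = arena.alloc(100)
second = arena.alloc(200)
first[:5] = b"hello"
print(len(arena.chunks), bytes(first[:5]))
```

Both requests above are satisfied from a single mapping, so only one trip to the OS is made. A real allocator adds freeing, size classes and thread safety on top of this.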

Kernel memory management: where do I begin?

I'm a bit of a noob when it comes to kernel programming, and was wondering if anyone could point me in the right direction for beginning the implementation of memory management in a kernel setting. I am currently working on a toy kernel and am doing a lot of research on the subject but I'm a bit confused on the topic of memory management. There are so many different aspects to it like paging and virtual memory mapping. Is there a specific order that I should implement things or any do's and dont's? I'm not looking for any code or anything, I just need to be pointed in the right direction. Any help would be appreciated.
There are multiple aspects that you should consider separately:
Managing the available physical memory.
Managing the memory required by the kernel and its data structures.
Managing the virtual memory (space) of every process.
Managing the memory required by any process, i.e. malloc and free.
To be able to manage any of the other memory demands you need to know actually how much physical memory you have available and what parts of it are available to your use.
Assuming your kernel is loaded by a Multiboot-compatible boot loader, you'll find this information in the Multiboot information structure that the boot loader passes to you (on x86, Multiboot 1 puts its address in ebx, while eax holds a magic value).
That structure contains a memory map describing which memory areas are used and which are free to use.
You also need to store this information somehow, and keep track of what memory is allocated and freed. An easy method to do so is to maintain a bitmap, where bit N indicates whether the (fixed size S) memory area from N * S to (N + 1) * S - 1 is used or free. Of course you probably want to use more sophisticated methods like multilevel bitmaps or free lists as your kernel advances, but a simple bitmap as above can get you started.
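The bitmap scheme just described might look like the following. A toy kernel would write this in its own implementation language; the version here is only a model of the bookkeeping, with an assumed frame size S of 4096 bytes and an invented class name FrameBitmap.

```python
FRAME_SIZE = 4096  # the fixed size S of each tracked memory area

class FrameBitmap:
    """Bit N tells whether the area [N*S, (N+1)*S - 1] is used."""

    def __init__(self, frame_count):
        self.frame_count = frame_count
        self.bits = bytearray((frame_count + 7) // 8)  # all frames start free

    def is_used(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def mark(self, n, used):
        if used:
            self.bits[n // 8] |= 1 << (n % 8)
        else:
            self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF

    def alloc_frame(self):
        """Return the physical address of the first free frame, or None."""
        for n in range(self.frame_count):
            if not self.is_used(n):
                self.mark(n, True)
                return n * FRAME_SIZE
        return None

    def free_frame(self, address):
        self.mark(address // FRAME_SIZE, False)
```

At boot, the kernel would first mark every frame that the boot loader's memory map reports as reserved (and the frames holding the kernel itself) before handing any out. The linear scan in alloc_frame is what multilevel bitmaps or free lists later replace.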
This memory manager usually only provides "large" sized memory chunks, usually multiples of 4KB. This is of course of no use for dynamic memory allocation in style of malloc and free that you're used to from applications programming.
Since dynamic memory allocation will greatly ease implementing advanced features of your kernel (multitasking, inter process communication, ...) you usually write a memory manager especially for the kernel. It provides means for allocation (kalloc) and deallocation (kfree) of arbitrary sized memory chunks. This memory is from pool(s) that are allocated using the physical memory manager from above.
All of the above is happening inside the kernel. You probably also want to provide applications means to do dynamic memory allocation. Implementing this is very similar in concept to the management of physical memory as done above:
A process only sees its own virtual address space. Some parts of it are unusable for the process (for example the area where the kernel memory is mapped into), but most of it will be "free to use" (that is, no actual physical memory is associated with it). As a minimum the kernel needs to provide applications with a means to allocate and free single pages of their address space. Allocating a page results (under the hood, invisible to the application) in a call to the physical memory manager, and in a mapping from the requested page to this newly allocated memory.
Note though that many kernels provide their processes either more sophisticated access to their own address space or directly implement some of the following tasks in the kernel.
Being able to allocate and free pages (4KB mostly) as before doesn't help with dynamic memory management, but as before this is usually handled by some other memory manager which is using these large memory chunks as pool to provide smaller chunks to the application. A prominent example is Doug Lea's allocator. Memory managers like these are usually implemented as library (part of the standard library most likely) that is linked to every application.

memory segments and physical RAM [closed]

The memory map of a process appears to be fragmented into segments (stack, heap, bss, data, and text),
I was wondering are these segments just an abstraction for the
convenience of the process and the physical RAM is just a linear array
of addresses or is the physical RAM also fragmented into these
segments?
Also if the RAM is not fragmented and is just a linear array then how
does the OS provide the process the abstraction of these segments?
Also how would programming change if the memory map to a process appeared as just a linear array and not divided into segments (with the MMU translating virtual addresses into physical ones)?
In a modern OS supporting virtual memory, it is the address space of the process that is divided into these segments. And in general case that address space of the process is projected onto the physical RAM in a completely random fashion (with some fixed granularity, 4K typically). Address space pages located next to each other do not have to be projected into the neighboring physical pages of RAM. Physical pages of RAM do not have to maintain the same relative order as the process's address space pages. This all means that there is no such separation into segments in RAM and there can't possibly be.
In order to optimize memory access an OS might (and typically will) try to map sequential pages of the process address space to sequential pages in RAM, but that's just an optimization. In general case, the mapping is unpredictable. On top of that the RAM is shared by all processes in the system, with RAM pages belonging to different processes being arbitrarily interleaved in RAM, which eliminates any possibility of having such "segments" in RAM. There's no process-specific ordering or segmentation in RAM. RAM is just a cache for virtual memory mechanism.
Again, every process works with its own virtual address space. This is where these segments can exist. The process has no direct access to RAM. The process doesn't even need to know that RAM exists.
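A toy model may help here. The sketch below represents each process's page table as a dictionary from virtual page number to physical frame number, with made-up frame numbers, and shows that neighbouring virtual pages need not be neighbours in RAM. The names translate and SegmentationFault are invented for the example.

```python
PAGE_SIZE = 4096

class SegmentationFault(Exception):
    """Raised when a virtual address has no mapping."""

def translate(page_table, vaddr):
    """Translate a virtual address using a {virtual page: frame} table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page not in page_table:
        raise SegmentationFault(hex(vaddr))
    return page_table[page] * PAGE_SIZE + offset

# Two processes use the same virtual pages but land in different frames.
process_a = {0x400: 7, 0x401: 2}  # adjacent virtual pages, scattered frames
process_b = {0x400: 5}

print(hex(translate(process_a, 0x400010)))
print(hex(translate(process_b, 0x400010)))
```

The same virtual address resolves to different physical addresses in the two processes, and an address with no entry simply cannot be reached.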
These segments are largely a convenience for the program loader and operating system (though they also provide a basis for coarse-grained protection; execution permission can be limited to text and writes prohibited from rodata).1
The physical memory address space might be fragmented but not for the sake of such application segments. For example, in a NUMA system it might be convenient for hardware to use specific bits to indicate which node owns a given physical address.
For a system using address translation, the OS can somewhat arbitrarily place the segments in physical memory. (With segmented translation, external fragmentation can be a problem; a contiguous range of physical memory addresses may not be available, requiring expensive moving of memory segments. With paged translation, external fragmentation is not possible. Segmented translation has the advantage of requiring less translation information: each segment requires only a base and bound with other metadata, whereas a memory section would typically have many more than two pages, each of which has a base address and metadata.)
Without address translation, placement of segments would necessarily be less arbitrary. Fortunately, most programs do not care about the specific address where segments are placed. (Single address space OSes
(Note that it can be convenient for sharable sections to be in fixed locations. For code this can be used to avoid indirection through a global offset table without requiring binary rewriting in the program loader/dynamic linker. This can also reduce address translation overhead.)
Application-level programming is generally sufficiently abstracted from such segmentation that its existence is not noticeable. However, pure abstractions are naturally unfriendly to intense optimization for physical resource use, including execution time.
In addition, a programming system may choose to use a more complex placement of data (without the application programmer needing to know the implementation details). For example, use of coroutines may encourage using a cactus/spaghetti stack where contiguity is not expected. Similarly, a garbage collecting runtime might provide additional divisions of the address space, not only for nurseries but also for separating leaf objects, which have no references to collectable memory, from non-leaf objects (reducing the overhead of mark/sweep). It is also not especially unusual to provide two stack segments, one for data whose address is not taken (or at least is fixed in size) and one for other data.
1 One traditional layout of these segments (with a downward growing stack) in a flat virtual address space for Unix-like OSes places text at the lowest address, rodata immediately above that, initialized data immediately above that, zero-initialized data (bss) immediately above that, heap growing upward from the top of bss, and stack growing downward from the top of the application's portion of the virtual address space.
Having heap and stack growing toward each other allows arbitrary growth of each (for a single thread using that address space!). This placement also allows a program loader to simply copy the program file into memory starting at the lowest address, groups memory by permission, and can sometimes allow a single global pointer to address all of the global/static data range (rodata, data, and bss).
The memory map to a process appears fragmented into segments (stack, heap, bss, data, and text)
That's the basic mapping used by Unix; other operating systems use different schemes. Generally, though, they split the process memory space into separate segments for executing code, stack, data, and heap data.
I was wondering are these segments just an abstraction for the convenience of the process and the physical RAM is just a linear array of addresses or is the physical RAM also fragmented into these segments?
Depends.
Yes, these segments are created and managed by the OS for the benefit of the process. But physical memory can be arranged as linear addresses, or banked segments, or non-contiguous blocks of RAM. It's up to the OS to manage the total system memory space so that each process can access its own portion of it.
Virtual memory adds yet another layer of abstraction, so that what looks like linear memory locations are in fact mapped to separate pages of RAM, which could be anywhere in the physical address space.
Also if the RAM is not fragmented and is just a linear array then how does the OS provide the process the abstraction of these segments?
The OS manages all of this by using virtual memory mapping hardware. Each process sees contiguous memory areas for its code, data, stack, and heap segments. But in reality, the OS maps the pages within each of these segments to physical pages of RAM. So two identical running processes will see the same virtual address space composed of contiguous memory segments, but the memory pages comprising these segments will be mapped to entirely different physical RAM pages.
But bear in mind that physical RAM may not actually be one contiguous block of memory, but may in fact be split across multiple non-adjacent blocks or memory banks. It is up to the OS to manage all of this in a way that is transparent to the processes.
Also how would programming change if the memory map to a process appeared as just a linear array and not divided into segments (with the MMU translating virtual addresses into physical ones)?
The MMU always operates that way, translating virtual memory addresses into physical memory addresses. The OS sets up and manages the mapping of each page of each segment for each process. Each time the process grows its stack beyond the pages currently mapped, for example, the OS traps a page fault and adds another page to the process's stack segment, mapping the virtual page to a physical page selected from available memory.
Virtual memory also allows the OS to swap out process pages temporarily to disk, so that the total amount of virtual memory occupied by all of the running processes can easily exceed the actual physical memory RAM space of a system. Only the currently active executing processes actually have access to real physical RAM pages.
I was wondering are these segments just an abstraction for the
convenience of the process and the physical RAM is just a linear array
of addresses or is the physical RAM also fragmented into these segments?
This in fact highly depends on architecture. Some will have hardware tools (e.g. descriptor registers for x86) to split the RAM into segments. Others just keep this information in software (OS kernel information for this process). Also some segments information are totally irrelevant on execution, they're used merely for code/data loading (e.g. relocation segments).
Also if the RAM is not fragmented and is just a linear array then how
does the OS provide the process the abstraction of these segments?
Process code never refers to segments; it only knows about addresses, so the OS has nothing to abstract.
Also how would programming change if the memory map to a process
appeared as just a linear array and not divided into segments
(with the MMU translating virtual addresses into physical ones)?
Programming would not be affected. When you program in C you don't define any of these segments, and code also doesn't reference these segments. These segments exist to keep an ordered layout, and don't even need to be the same across operating systems.

Why do we need virtual memory?

So my understanding is that every process has its own virtual memory space ranging from 0x0 to 0xFF....F. These virtual addresses correspond to addresses in physical memory (RAM). Why is this level of abstraction helpful? Why not just use the direct addresses?
I understand why paging is beneficial, but not virtual memory.
There are many reasons to do this:
If you have a compiled binary, each function has a fixed address in memory and the assembly instructions to call functions have that address hardcoded. If virtual memory didn't exist, two programs couldn't be loaded into memory and run at the same time, because they'd potentially need to have different functions at the same physical address.
If two or more programs are running at the same time (or are being context-switched between) and use direct addresses, a memory error in one program (for example, reading a bad pointer) could destroy memory being used by the other process, taking down multiple programs due to a single crash.
On a similar note, there's a security issue where a process could read sensitive data in another program by guessing what physical address it would be located at and just reading it directly.
If you try to combat the two above issues by paging out all the memory for one process when switching to a second process, you incur a massive performance hit because you might have to page out all of memory.
Depending on the hardware, some memory addresses might be reserved for physical devices (for example, video RAM, external devices, etc.) If programs are compiled without knowing that those addresses are significant, they might physically break plugged-in devices by reading and writing to their memory. Worse, if that memory is read-only or write-only, the program might write bits to an address expecting them to stay there and then read back different values.
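The isolation argument in this list can be observed directly on a Unix system. The sketch below is an illustration, not a proof: it forks a child, lets the child overwrite a variable, and reports that parent and child see the same virtual address holding different values. The function name fork_demo is invented, and os.fork is not available on Windows.

```python
import ctypes
import os

def fork_demo():
    """Return (parent addr, child addr, parent value, child value)."""
    value = ctypes.c_int(1)
    parent_addr = ctypes.addressof(value)
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: same virtual address, but its own private copy of the page.
        os.close(read_end)
        value.value = 2
        report = f"{ctypes.addressof(value):#x} {value.value}"
        os.write(write_end, report.encode())
        os._exit(0)
    os.close(write_end)
    child_addr, child_value = os.read(read_end, 64).decode().split()
    os.close(read_end)
    os.waitpid(pid, 0)
    return parent_addr, int(child_addr, 16), value.value, int(child_value)

if __name__ == "__main__":
    p_addr, c_addr, p_val, c_val = fork_demo()
    print(f"parent {p_addr:#x} -> {p_val}, child {c_addr:#x} -> {c_val}")
```

Both processes print the same address, yet the child's write never shows up in the parent, because the two virtual addresses are backed by different physical pages after the write.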
Hope this helps!
Short answer: Program code and data required for execution of a process must reside in main memory to be executed, but main memory may not be large enough to accommodate the needs of an entire process.
Two proposals
(1) Using a very large main memory to alleviate any need for storage allocation: it's not feasible due to very high cost.
(2) Virtual memory: It allows processes that may not be entirely in the memory to execute by means of automatic storage allocation upon request. The term virtual memory refers to the abstraction of separating LOGICAL memory--memory as seen by the process--from PHYSICAL memory--memory as seen by the processor. Because of this separation, the programmer needs to be aware of only the logical memory space while the operating system maintains two or more levels of physical memory space.
More:
Early computer programmers divided programs into sections that were transferred into main memory for a period of processing time. As higher level languages became popular, the efficiency of complex programs suffered from poor overlay systems. The problem of storage allocation became more complex.
Two theories for solving the problem of inefficient memory management emerged -- static and dynamic allocation. Static allocation assumes that the availability of memory resources and the memory reference string of a program can be predicted. Dynamic allocation relies on memory usage increasing and decreasing with actual program needs, not on predicting memory needs.
Program objectives and machine advancements in the '60s made the predictions required for static allocation difficult, if not impossible. Therefore, the dynamic allocation solution was generally accepted, but opinions about implementation were still divided.
One group believed the programmer should continue to be responsible for storage allocation, which would be accomplished by system calls to allocate or deallocate memory. The second group supported automatic storage allocation performed by the operating system, because of increasing complexity of storage allocation and emerging importance of multiprogramming.
In 1961, two groups proposed a one-level memory store. One proposal called for a very large main memory to alleviate any need for storage allocation. This solution was not possible due to very high cost. The second proposal is known as virtual memory.
cne/modules/vm/green/defn.html
To execute a process, its code and data must be in main memory (RAM). This might not be possible if the process is larger than the available RAM.
Virtual memory provides an idealized abstraction of physical memory, creating the illusion of a memory that is larger than the RAM actually installed.
Virtual memory combines active RAM and inactive memory on disk to form a large range of virtual contiguous addresses. Implementations usually require hardware support, typically in the form of a memory management unit built into the CPU.
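As a rough illustration of what the memory management unit does, here is a toy model in C. It is a sketch under stated assumptions, not how any real MMU or kernel is implemented: the 4096-byte page size, the four-entry table and its contents, and the names pte and translate are all invented for this example. A virtual address is split into a page number and an offset. If the page is marked present, the frame number from the table is combined with the offset. Otherwise the lookup reports a page fault, which is the point where a real OS would bring the page in from disk.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size for the toy model */
#define NUM_PAGES 4u      /* assumed size of the toy address space, in pages */

/* One page-table entry: is the page in RAM, and if so in which frame? */
struct pte {
    int present;
    uint32_t frame;
};

/* Hypothetical table contents: pages 0, 2 and 3 are in RAM, page 1 is not. */
static const struct pte page_table[NUM_PAGES] = {
    {1, 5}, {0, 0}, {1, 2}, {1, 0}
};

/* Translates a virtual address to a physical one.
   Returns 0 and stores the result in *paddr, or -1 to signal a page fault. */
int translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t page = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    if (page >= NUM_PAGES || !page_table[page].present)
        return -1;

    *paddr = page_table[page].frame * PAGE_SIZE + offset;
    return 0;
}
```

In this model virtual address 0x0010 lies in page 0, which maps to frame 5, so it translates to 0x5010, while any address in page 1 faults. The process only ever sees the contiguous virtual range. Where each page actually lives is the table's business.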
The main purpose of virtual memory is multi-tasking and running large programmes. It would be ideal to keep everything in physical memory (RAM), because that is a lot faster, but RAM is a lot more expensive per byte than disk storage.
Good luck!

operating systems memory management - malloc() invocation

I'm studying up on OS memory management, and I wish to verify that I got the basic mechanism of allocation \ virtual memory \ paging straight.
Let's say a process calls malloc(), what happens behind the scenes?
my answer: The runtime library finds an appropriately sized block of memory in its virtual memory address space.
(This is where allocation algorithms such as first-fit, best-fit that deal with fragmentation come into play)
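For reference, a first-fit search can be sketched in a few lines of C. This is a toy model and not how any real malloc() is implemented: the 1024-byte arena, the fixed-size block table, and the names toy_alloc, toy_free and NO_BLOCK are assumptions made up for the illustration. It hands out offsets into the arena instead of pointers and never merges neighbouring free blocks.

```c
#include <stddef.h>

#define ARENA_SIZE 1024            /* assumed size of the toy heap */
#define MAX_BLOCKS 16              /* assumed limit on tracked blocks */
#define NO_BLOCK ((size_t)-1)      /* returned when nothing fits */

struct block {
    size_t offset;                 /* start of the block inside the arena */
    size_t size;
    int used;
};

/* Blocks are kept sorted by offset; initially one free block covers everything. */
static struct block blocks[MAX_BLOCKS] = { {0, ARENA_SIZE, 0} };
static size_t block_count = 1;

/* First fit: take the first free block that is large enough. */
size_t toy_alloc(size_t size)
{
    if (size == 0)
        return NO_BLOCK;

    for (size_t i = 0; i < block_count; i++) {
        if (blocks[i].used || blocks[i].size < size)
            continue;

        /* Split off the unused tail if there is room to record it. */
        if (blocks[i].size > size && block_count < MAX_BLOCKS) {
            for (size_t j = block_count; j > i + 1; j--)
                blocks[j] = blocks[j - 1];
            blocks[i + 1].offset = blocks[i].offset + size;
            blocks[i + 1].size = blocks[i].size - size;
            blocks[i + 1].used = 0;
            blocks[i].size = size;
            block_count++;
        }
        blocks[i].used = 1;
        return blocks[i].offset;
    }
    return NO_BLOCK;
}

/* Marks the block starting at `offset` as free again (no coalescing). */
void toy_free(size_t offset)
{
    for (size_t i = 0; i < block_count; i++) {
        if (blocks[i].used && blocks[i].offset == offset) {
            blocks[i].used = 0;
            return;
        }
    }
}
```

Allocating 100, 200 and 50 bytes, freeing the 200-byte block and then asking for 150 bytes reuses the hole at offset 100 and leaves a 50-byte fragment behind it. That leftover fragment is the kind of fragmentation the choice of fit strategy is meant to manage.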
Now let's say the process accesses that memory, how is that done?
my answer: The memory address, as seen by the process, is in fact virtual. The OS checks if that address is currently mapped to a physical memory address and if so performs the access. If it isn't mapped - a page fault is raised.
Am I getting this straight? i.e. the compiler\runtime library are in charge of allocating virtual memory blocks, and the OS is in charge of a mapping between processes' virtual address and physical addresses (and the paging algorithm that entails)?
Thanks!
About right. The memory needs to exist in the virtual memory of the process for a page fault to actually allocate a physical page though. You can't just start poking around anywhere and expect the kernel to put physical memory where you happen to access.
There is much more to it than this. Read up on mmap(), anonymous and not, shared and private. And brk() too. malloc() builds on brk() and mmap().
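A minimal sketch of that point on Linux: the mapping has to be created first (here with an anonymous, private mmap()), and physical pages are only supplied when the memory is first touched. The helper names resident_pages and demand_paging_demo are invented for this example. mincore() is used to ask which pages are currently resident, and what it reports before the first touch can vary between kernels and sandboxes, so take the numbers as indicative only.

```c
#define _DEFAULT_SOURCE            /* for mincore() and MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Counts how many pages of [addr, addr + len) are resident in RAM, or -1 on error. */
static long resident_pages(void *addr, size_t len, size_t page)
{
    unsigned char vec[64];
    size_t n = (len + page - 1) / page;

    if (n > sizeof vec || mincore(addr, len, vec) != 0)
        return -1;

    long count = 0;
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;       /* lowest bit set means the page is resident */
    return count;
}

/* Maps 16 pages, touches one, and reports residency before and after.
   Returns 0 on success, -1 on error. */
int demand_paging_demo(long *before, long *after)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 16 * page;

    /* Reserve virtual address space; no physical pages need to be assigned yet. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = resident_pages(p, len, page);
    p[0] = 1;                      /* first touch: page fault, kernel supplies a page */
    *after = resident_pages(p, len, page);

    munmap(p, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

On a typical Linux system one would expect few or no resident pages before the write and at least one afterwards. Writing to an address outside any mapping is a different case: no mapping exists for the fault to resolve against, so the kernel delivers SIGSEGV instead of allocating a page.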
You've almost got it. The one thing you missed is how the process asks the system for more virtual memory in the first place. As Thomas pointed out, you can't just write where you want. There's no reason an OS couldn't be designed to allow that, but it's much more efficient if it has some idea where you're going to be writing and the space where you do it is contiguous.
On Unixy systems, userland processes have a region called the data segment, which is what it sounds like: it's where the data goes. When a process needs memory for data, it calls brk(), which asks the system to extend the data segment to a specified pointer value. (For example, if your existing data segment was empty and you wanted to extend it to 2M, you'd call brk(0x200000).)
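A small sketch of the same idea using sbrk(), the increment-based sibling of brk(), on Linux with glibc. The function name grow_data_segment is made up for this example. sbrk(n) moves the end of the data segment (the "program break") by n bytes and returns the previous break, and sbrk(0) just reports the current one. The sketch deliberately does not shrink the segment again, since malloc() may be using the same region.

```c
#define _DEFAULT_SOURCE            /* for sbrk() on glibc */
#include <stdint.h>
#include <unistd.h>

/* Grows the data segment by `bytes`.
   Returns how far the program break moved, or -1 on error. */
long grow_data_segment(intptr_t bytes)
{
    void *old_break = sbrk(bytes); /* returns the break as it was before the call */
    if (old_break == (void *)-1)
        return -1;

    void *new_break = sbrk(0);     /* query the break after growing */
    return (long)((char *)new_break - (char *)old_break);
}
```

After a successful call, the addresses between the old and the new break are valid to read and write. In practice you would leave this to malloc(), which calls brk() or mmap() on your behalf.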
Note that while very common, brk() is not a standard; in fact it was yanked out of POSIX.1 a decade ago because C specifies malloc() and there's no reason to mandate the interface for data segment allocation.

Resources