I'm studying up on OS memory management, and I want to verify that I have the basic mechanism of allocation / virtual memory / paging straight.
Let's say a process calls malloc(), what happens behind the scenes?
my answer: The runtime library finds an appropriately sized block of memory in the process's virtual address space.
(This is where allocation algorithms such as first-fit and best-fit, which deal with fragmentation, come into play.)
Now let's say the process accesses that memory, how is that done?
my answer: The memory address, as seen by the process, is in fact virtual. The OS checks whether that address is currently mapped to a physical memory address and, if so, performs the access. If it isn't mapped, a page fault is raised.
Am I getting this straight? I.e. the compiler/runtime library is in charge of allocating virtual memory blocks, and the OS is in charge of the mapping between a process's virtual addresses and physical addresses (and the paging algorithm that entails)?
Thanks!
About right. The memory needs to exist in the virtual memory of the process for a page fault to actually allocate a physical page though. You can't just start poking around anywhere and expect the kernel to put physical memory where you happen to access.
There is much more to it than this. Read up on mmap(), anonymous and not, shared and private. And brk() too. malloc() builds on brk() and mmap().
You've almost got it. The one thing you missed is how the process asks the system for more virtual memory in the first place. As Thomas pointed out, you can't just write where you want. There's no reason an OS couldn't be designed to allow that, but it's much more efficient if it has some idea where you're going to be writing and the space where you do it is contiguous.
On Unixy systems, userland processes have a region called the data segment, which is what it sounds like: it's where the data goes. When a process needs memory for data, it calls brk(), which asks the system to extend the data segment to a specified pointer value. (For example, if your existing data segment was empty and you wanted to extend it to 2M, you'd call brk(0x200000).)
Note that while very common, brk() is not a standard; in fact it was yanked out of POSIX.1 a decade ago because C specifies malloc() and there's no reason to mandate the interface for data segment allocation.
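As a rough illustration of how an allocator obtains memory from the OS on a Unixy system, here is a minimal sketch using sbrk() and an anonymous private mmap(); real malloc implementations are considerably more involved, and the sizes chosen here are arbitrary.

```c
/* Minimal sketch (not a real malloc): obtain memory from the OS with sbrk()
 * and, for a larger request, an anonymous private mmap(). */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Extend the data segment by 4096 bytes; sbrk() returns the old break. */
    void *block = sbrk(4096);
    if (block == (void *)-1) { perror("sbrk"); return 1; }
    memset(block, 0, 4096);                 /* the new bytes are now usable */

    /* For big allocations, ask for fresh anonymous pages instead. */
    void *big = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (big == MAP_FAILED) { perror("mmap"); return 1; }

    printf("data segment grew at %p, mmap'd region at %p\n", block, big);
    munmap(big, 1 << 20);
    return 0;
}
```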
According to "Windows Internals, Part 1" (7th Edition, Kindle version):
Pages in a process virtual address space are either free, reserved, committed, or shareable.
Focusing only on the reserved and committed pages, the first type is described in the same book:
Reserving memory means setting aside a range of contiguous virtual addresses for possible future use (such as an array) while consuming negligible system resources, and then committing portions of the reserved space as needed as the application runs. Or, if the size requirements are known in advance, a process can reserve and commit in the same function call.
Both reserving and committing will initially get you entries in the VADs (virtual address descriptors), but neither operation will touch the PTE (page table entry) structures. Reserving used to cost PTEs before Windows 8.1, but it no longer does.
As described above, reserved means blocking a range of virtual addresses, NOT blocking physical memory or paging-file space at the OS level. The OS doesn't include this in the commit limit, so when the time comes to actually use this memory, you might get a surprise. It's important to note that reserving happens from the perspective of the process address space. There's no physical resource being reserved - no stamping of "no vacancy" against RAM space or page file(s).
The analogy with plots of land might be missing something: take reserved as the area of land surrounded by wooden poles, thus letting others know that the land is taken. But how about committed? It can't be land on which structures (e.g. houses) have already been built, since those would require PTEs and there are none there yet, since we haven't accessed anything. It's only when touching committed data that the PTEs will get built, which makes the pages available to the process.
The main problem is that committed memory - at least in its initial state - is functionally very much like reserved memory. It's just an area blocked within the VADs. Try to touch one of the addresses, and you'll get an access violation exception for a reserved address:
Attempting to access free or reserved memory results in an access violation exception because the page isn’t mapped to any storage that can resolve the reference
...and an initial page fault for a committed one (immediately followed by the required PTE entries being created).
Back to the land analogy: once houses are built, that patch of land is still committed. Yet this is a bit peculiar, since it was also committed when the original grass was there, before the first shovelful of earth was dug to start construction. It resembled the same state as that of a reserved patch. Maybe it would be better to think of it as terrain eligible for construction; e.g. you have a permit to build (although you might never build so much as a wall on that patch of land).
What would be the reasons for using one type of memory versus the other? There's at least one: the OS guarantees that there will be room to provide committed memory, should it ever be needed in the future, but doesn't guarantee anything for reserved memory aside from blocking that range of the process' address space. The only downside of committed memory is that one or more paging files might need to be extended in size so that the commit limit can cover the recently allocated block, so that should the requester demand the use of part or all of the data in the future, the OS can provide access to it.
I can't really see how the land analogy can capture this detail of "guarantee". After all, the reserved patch also physically existed, covered by the same grass as a committed one in its pristine state.
The stack is another scenario where reserved and committed memory are used together:
When a thread is created, the memory manager automatically reserves a predetermined amount of virtual memory, which by default is 1 MB.[...] Although 1 MB is reserved, only the first page of the stack will be committed [...]
along with a guard page. When a thread’s stack grows large enough to touch the guard page, an exception occurs, causing an attempt to allocate another guard. Through this mechanism, a user stack doesn’t immediately consume all 1 MB of committed memory but instead grows with demand.
There is an answer here that deals with why one would want to use reserved memory as opposed to committed. It involves storing continuously expanding data - which is actually the stack model described above - and having specific absolute address ranges available when needed (although I'm not sure why one would want to do that within a process).
OK, what am I actually asking?
What would be a good analogy for the reserved/committed concept?
Any other reason, aside from those depicted above, that would mandate the use of reserved memory? Are there any interesting use cases where resorting to reserved memory is a smart move?
Your question hits upon the difference between logical memory translation and virtual memory translation. While CPU documentation likes to conflate these two concepts, they are different in practice.
If you look at logical memory translation, there are only two states for a page. Using your terminology, they are FREE and COMMITTED. A free page is one that has no mapping to a physical page frame, and a COMMITTED page has such a mapping.
In a virtual memory system, the operating system has to maintain a copy of the address space in secondary storage. How this is done depends upon the operating system. Typically, a process will have its address space mapped to several different files for secondary storage. The operating system divides the address space into units usually called SECTIONS.
For example, the code and read-only data could be stored virtually as one or more SECTIONS in the executable file. Code and static data in shared libraries could each be in a different section that is paged to the shared library files. You might have a shared file mapped into the process, providing memory that can be accessed by multiple processes; that forms another section. Most of the read/write data is likely to be in a page file, in one or more sections. How the operating system tracks where it virtually stores each section of data is system dependent.
For Windows, that gives the definition of one of your terms: Shareable. A shareable section is one where a range of addresses can be mapped into different processes, at different (or possibly the same) logical addresses.
Your last term is then RESERVED. If you look at the Windows VirtualAlloc function documentation, you can see that (among your options) you can RESERVE or COMMIT. If you reserve, you are creating a section of VIRTUAL MEMORY that has no mapping to physical memory.
This RESERVE/COMMIT model is Windows-specific (although other operating systems may do the same). The likely reason was to save disk space. When Windows NT was developed, 600 MB drives the size of a washing machine were still in use.
In these days of 64-bit address spaces, this system works well for (as you say) expanding data. In theory, an exception handler for a stack overrun can simply expand the stack. Reserving 4GB of memory takes no more resources than reserving a single page (which would not be practicable in a 32-bit system—see above). If you have 20 threads, this makes reserving stack space efficient.
What would be a good analogy for the reserved/committed concept?
One could say RESERVE is like buying an option to buy and COMMIT is exercising the option.
Any other reason, aside from those depicted above, that would mandate the use of reserved memory? Are there any interesting use cases where resorting to reserved memory is a smart move?
IMHO, the most likely places to RESERVE without COMMITTING are for creating stacks and heaps with the former being the most important.
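As a sketch of that reserve-then-commit pattern (using the documented VirtualAlloc/VirtualFree calls; the sizes chosen here are arbitrary), something like the following reserves a large range up front and commits only the portion actually needed:

```c
/* Reserve a large range of address space, then commit only what is needed.
 * Error handling is trimmed; the sizes are arbitrary. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T reserveSize = (SIZE_T)1 << 30;              /* 1 GB of address space */
    char *base = VirtualAlloc(NULL, reserveSize, MEM_RESERVE, PAGE_NOACCESS);
    if (!base) return 1;

    /* Touching 'base' here would raise an access violation: it is only reserved. */

    SIZE_T committed = 64 * 1024;                       /* commit the first 64 KB */
    if (!VirtualAlloc(base, committed, MEM_COMMIT, PAGE_READWRITE)) return 1;
    base[0] = 42;   /* first touch: page fault, PTE built, page becomes usable */

    printf("reserved %zu bytes at %p, committed %zu\n",
           reserveSize, (void *)base, committed);

    VirtualFree(base, 0, MEM_RELEASE);                  /* release the whole range */
    return 0;
}
```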
I'm a bit of a noob when it comes to kernel programming, and was wondering if anyone could point me in the right direction for beginning the implementation of memory management in a kernel setting. I am currently working on a toy kernel and am doing a lot of research on the subject, but I'm a bit confused on the topic of memory management. There are so many different aspects to it, like paging and virtual memory mapping. Is there a specific order in which I should implement things, or any do's and don'ts? I'm not looking for any code or anything, I just need to be pointed in the right direction. Any help would be appreciated.
There are multiple aspects that you should consider separately:
Managing the available physical memory.
Managing the memory required by the kernel and its data structures.
Managing the virtual memory (space) of every process.
Managing the memory required by any process, i.e. malloc and free.
To be able to manage any of the other memory demands you first need to know how much physical memory you actually have and which parts of it are available for your use.
Assuming your kernel is loaded by a multiboot-compatible boot loader, you'll find this information in the multiboot information structure that the boot loader passes you (on x86, a pointer to it in ebx and a magic value in eax, if I remember correctly).
This structure contains a memory map describing which memory areas are used and which are free to use.
You also need to store this information somehow, and keep track of what memory is allocated and freed. An easy method to do so is to maintain a bitmap, where bit N indicates whether the (fixed size S) memory area from N * S to (N + 1) * S - 1 is used or free. Of course you probably want to use more sophisticated methods like multilevel bitmaps or free lists as your kernel advances, but a simple bitmap as above can get you started.
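As a sketch of such a bitmap (the frame size, the limit, and the names frame_bitmap/alloc_frame/free_frame are made up for illustration; at boot you would mark everything used and then clear the bits for regions the boot loader's memory map reports as free):

```c
/* Toy physical-frame bitmap: bit N set means the 4 KB frame starting at
 * N * 4096 is in use. Names and limits are made up for illustration. */
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE  4096u
#define MAX_FRAMES  32768u                          /* covers 128 MB of RAM */

static uint32_t frame_bitmap[MAX_FRAMES / 32];      /* 1 bit per frame */

static void mark_frame(size_t n, int used)
{
    if (used) frame_bitmap[n / 32] |=  (1u << (n % 32));
    else      frame_bitmap[n / 32] &= ~(1u << (n % 32));
}

/* Return the physical address of a free frame, or 0 if none is left. */
static uintptr_t alloc_frame(void)
{
    for (size_t n = 0; n < MAX_FRAMES; n++) {
        if (!(frame_bitmap[n / 32] & (1u << (n % 32)))) {
            mark_frame(n, 1);
            return (uintptr_t)n * FRAME_SIZE;
        }
    }
    return 0;                                       /* out of physical memory */
}

static void free_frame(uintptr_t addr)
{
    mark_frame(addr / FRAME_SIZE, 0);
}
```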
This memory manager usually only provides "large" memory chunks, usually multiples of 4 KB. This is of course of no use for dynamic memory allocation in the style of malloc and free that you're used to from application programming.
Since dynamic memory allocation will greatly ease implementing advanced features of your kernel (multitasking, inter-process communication, ...), you usually write a memory manager especially for the kernel. It provides means for allocation (kalloc) and deallocation (kfree) of arbitrarily sized memory chunks. This memory comes from pool(s) that are allocated using the physical memory manager from above.
All of the above is happening inside the kernel. You probably also want to provide applications means to do dynamic memory allocation. Implementing this is very similar in concept to the management of physical memory as done above:
A process only sees its own virtual address space. Some parts of it are unusable for the process (for example the area into which kernel memory is mapped), but most of it will be "free to use" (that is, no physical memory is actually associated with it). As a minimum, the kernel needs to provide applications with a means to allocate and free single pages of their memory address space. Allocating a page results (under the hood, invisible to the application) in a call to the physical memory manager, and in a mapping from the requested page to this newly allocated memory.
Note though that many kernels provide their processes either more sophisticated access to their own address space, or directly implement some of the following tasks in the kernel.
Being able to allocate and free pages (mostly 4 KB) as before doesn't help with dynamic memory management, but as before this is usually handled by some other memory manager which uses these large memory chunks as a pool to provide smaller chunks to the application. A prominent example is Doug Lea's allocator. Memory managers like these are usually implemented as a library (most likely part of the standard library) that is linked to every application.
In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My questions are:
If each process has its own address space, then it should be free to access any address within it. So apart from permission-restricted sections like .data, .bss, .text, etc., one should be free to change the value at any address. But this usually gives a segmentation fault; why?
To acquire dynamic memory, we need to call malloc. If the whole virtual address space is made available to a process, then why can't it access it directly?
Different runs of a program result in different addresses for variables (both on the stack and the heap). Why is that, when the environment for each run is the same? Does it not affect the amount of addressable memory available for use? (Does it have something to do with address space randomization?)
Some links on memory allocation (e.g. on the heap) would also help.
The information available in different places is very confusing, as it talks about old and modern systems, often without distinguishing between them. It would be helpful if someone could clarify these doubts with modern systems in mind, say Linux.
Thanks.
Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
Different memory regions serve different purposes:
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stacks, so the OS might as well forbid that. Moreover, the thread stack can be de-allocated when the thread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set), or above 3<<30 (top two bits set) depending on configuration, is kernel memory. You should not assume any unallocated memory region is in user space.
if an application thinks about using a memory region but doesn't tell the OS, the OS may allocate something else to that memory (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyways), but it's easier to just call an OS function. It's just a good idea that the function call is "give me 10MB of heap", not "give me 10MB of heap starting at 0x12345678"
If the application allocates memory simply by using it, then it typically never de-allocates at all. This can be problematic, as the OS still has to hold the unused pages (but then the Java Virtual Machine doesn't de-allocate either, so hey).
Different runs of a program result in different addresses for variables
This is called address space layout randomisation (ASLR) and is used, alongside proper permissions (stack space is not executable), to make buffer overflow attacks much more difficult. An attacker can still crash the app, but not (easily) execute arbitrary code.
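You can observe this yourself with a small test program; on a modern Linux with ASLR enabled the printed addresses should differ between runs:

```c
/* Print the address of a stack variable, a heap block, and the code itself.
 * Run it a few times: with ASLR enabled (the default on modern Linux),
 * the printed addresses change from run to run. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int on_stack = 0;
    void *on_heap = malloc(16);

    printf("stack: %p  heap: %p  code: %p\n",
           (void *)&on_stack, on_heap, (void *)&main);

    free(on_heap);
    return 0;
}
```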
Some links on memory allocation (e.g. on the heap).
Do you mean, what algorithm the allocator uses? The easiest algorithm is to always allocate at the first available position, link each memory block to the next, and store a flag saying whether it's a free or a used block. More advanced algorithms allocate blocks whose size is a power of two or a multiple of some fixed size to prevent memory fragmentation (lots of small free blocks), or link the blocks into different structures to find a free block of sufficient size faster.
An even simpler approach is to never de-allocate: just keep a pointer to the first (and only) free block together with its size. If the remaining space is too small, throw it away and ask the OS for a new one.
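A sketch of that "never de-allocate" scheme (the names bump_alloc/arena are made up for illustration; the region is obtained with an anonymous mmap()):

```c
/* The "never de-allocate" approach: keep one pointer into a big region
 * obtained from the OS and hand out chunks by bumping it forward. */
#include <stdio.h>
#include <stddef.h>
#include <sys/mman.h>

static char  *arena;            /* start of the unused part of the region */
static size_t arena_left;       /* bytes still unused                     */

static void *bump_alloc(size_t size)
{
    size = (size + 15) & ~(size_t)15;           /* keep 16-byte alignment */
    if (size > arena_left) {
        /* Remaining space is too small: throw it away, ask the OS again. */
        size_t region = size > (1u << 20) ? size : (1u << 20);
        arena = mmap(NULL, region, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (arena == MAP_FAILED) return NULL;
        arena_left = region;
    }
    void *p = arena;
    arena      += size;
    arena_left -= size;
    return p;                   /* note: there is no matching free()      */
}

int main(void)
{
    char *a = bump_alloc(100);
    char *b = bump_alloc(200);
    printf("a=%p b=%p (b - a = %td bytes)\n", (void *)a, (void *)b, b - a);
    return 0;
}
```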
There's nothing magical about memory allocators. All they do is to:
ask the OS for a large region and
partition it into smaller chunks
without
wasting too much space or
taking too long.
Anyway, the Wikipedia article about memory allocation is http://en.wikipedia.org/wiki/Memory_management.
One interesting algorithm is called "(binary) buddy blocks". It holds several pools of a power-of-two size and splits them recursively into smaller regions. Each region is then either fully allocated, fully free or split in two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.
When a segmentation fault occurs, it means I accessed memory that is not allocated or is protected. But how does the kernel or CPU know this? Is it implemented in hardware? What data structures does the CPU have to look up? When a block of memory is allocated, which data structures need to be modified?
The details will vary, depending on what platform you're talking about, but typically the MMU will generate an exception (interrupt) when you attempt an invalid memory access and the kernel will then handle this as part of an interrupt service routine.
A seg fault generally happens when a process attempts to access memory in a way it is not allowed to (for example, an address that isn't mapped, or a write to a read-only page). It is the hardware that notifies the OS about a memory access violation. The OS kernel then sends a signal to the process which caused the exception.
To answer the second part of your question, again it depends on the hardware and OS. In a typical system (i.e. x86) the CPU consults the segment registers (via the global or local descriptor tables) to turn the segment-relative address into a virtual address (this is usually, but not always, a no-op on modern x86 operating systems), and then consults the page tables to turn that virtual address into a physical address (the MMU really does this bit, but on x86 it's part of the CPU). When it encounters a page which is not marked present (the present bit is not set in the page directory or tables) it raises an exception. When the OS handles this exception, it will either give up (giving rise to the segfault signal you see when you make a mistake, or a panic) or it will modify the page tables to make the memory valid and continue from the exception. Typically the OS has some bookkeeping which says which pages could be valid, and how to get the page. This is how demand paging occurs.
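To make that x86 walk concrete, here is a tiny sketch of how a classic 32-bit two-level translation slices up a virtual address (pure arithmetic; no real page tables are consulted):

```c
/* The top 10 bits select the page-directory entry, the next 10 bits the
 * page-table entry, and the low 12 bits are the offset in the 4 KB page. */
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    uint32_t vaddr = 0xBFFFF123u;                    /* an example address */

    uint32_t dir_index   = (vaddr >> 22) & 0x3FFu;   /* bits 31..22 */
    uint32_t table_index = (vaddr >> 12) & 0x3FFu;   /* bits 21..12 */
    uint32_t offset      =  vaddr        & 0xFFFu;   /* bits 11..0  */

    printf("vaddr 0x%08" PRIX32 " -> PDE %" PRIu32 ", PTE %" PRIu32
           ", offset 0x%03" PRIX32 "\n",
           vaddr, dir_index, table_index, offset);

    /* If the PDE or PTE reached this way has its 'present' bit clear,
       the MMU raises a page fault and the kernel's handler takes over. */
    return 0;
}
```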
It all depends on the particular architecture, but all architectures with paged virtual memory work essentially the same. There are data structures in memory that describe the virtual-to-physical mapping of each allocated page of memory. For every memory access, the CPU/MMU hardware looks up those tables to find the mapping. This would be horribly slow, of course, so there are hardware caches to speed it up.
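On a POSIX system you can watch the tail end of this mechanism from user space: install a SIGSEGV handler with sigaction() and the kernel hands you the faulting address it received from the MMU in si_addr. A minimal sketch:

```c
/* Install a SIGSEGV handler, then deliberately touch an unmapped address.
 * (fprintf is not strictly async-signal-safe, but it is fine for a demo.) */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    fprintf(stderr, "segfault at address %p\n", info->si_addr);
    _exit(1);
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_segv;
    sa.sa_flags     = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *bad = (int *)0x10;   /* almost certainly not mapped         */
    *bad = 42;                         /* MMU faults, kernel delivers SIGSEGV */
    return 0;
}
```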
Consider the following system: a hardware board with, say, an ARM Cortex-A8 and a NEON vector coprocessor, running embedded Linux on the Cortex-A8. In this environment, if some application - say, a video decoder - is executing, then:
How is it decided which buffers would be in external memory, which ones would be allocated in internal SRAM, etc.
When one calls calloc/malloc on such a system/code, the pointer returned is from which memory: internal or external?
Can a user make buffers to be allocated in the memories of his choice (internal/external)?
In ARM architectures, there is another memory called "tightly coupled memory" (TCM). What is that and how can user enable and use it? Can I declare buffers in this memory?
Do I need to see the memory map (if any) of the hardware board to understand about all these different physical memories present in a typical hardware board?
How much of a role does the OS play in distinguishing these different memories?
Sorry for the multiple questions, but I think they are all interlinked.
Please note that I'm not familiar with ARM nor with embedded Linux specifically, so all of my comments will be from a general point of view.
First, about cache: Very early during boot, the operating system will do some amount of cache initialization. Exactly what this entails will vary from processor to processor, but the net effect is to ensure cache is initialized properly, and then enable its use by the processor. After this, the cache is operated exclusively by the processor with no further interaction by the operating system or your programs.
Now, on to external (off-chip) and internal (on-chip) memories:
The operating system owns all hardware on the system, including the internal and external memories, and so is ultimately responsible for discovering, configuring, and allocating these resources within the kernel and to user processes. In a typical system (e.g., your desktop or a 1U server) there won't usually be any special internal (on-chip) RAM, so the operating system can treat all DRAM equally. It will go into a general pool of pages (usually 4 KB) for allocation to processes, file system buffers, etc. On a system with special memory of various sorts (NVRAM, high-speed on-chip memory, and a few others), the operating system's general policies aren't usually correct.
How this is presented to the user will depend on choices made while porting the OS to this system.
One could modify the OS to be explicitly aware of this special memory and provide special system calls to allocate it to userland processes. However, this could be quite a bit of work unless the embedded Linux being used has at least some support for this sort of thing.
The approach I'd probably take would be to avoid modifying the kernel itself, and instead write a device driver for the internal memory. A driver of this sort would typically provide some sort of mmap interface to allow user processes to get simple address-based access to the internal memory.
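The userspace side of such a driver might look roughly like this (the device node /dev/sram and the 64 KB mapping size are hypothetical; the real names depend on the driver):

```c
/* Hypothetical userspace side of an on-chip-memory driver: open the device
 * node and mmap it to get a pointer to the fast on-chip region. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/sram", O_RDWR);              /* hypothetical device */
    if (fd < 0) { perror("open /dev/sram"); return 1; }

    size_t len = 64 * 1024;                          /* assumed SRAM size   */
    void *sram = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (sram == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    ((volatile char *)sram)[0] = 0x5A;               /* use it as a fast buffer */

    munmap(sram, len);
    close(fd);
    return 0;
}
```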
Here are answers to some of your concrete questions.
How much of a role does the OS play in distinguishing these different memories?
If your system has taken the device-driver approach described above, then the OS probably knows only about external memory, or perhaps just enough about the internal memories to initialize them properly (although that would likely be in the device driver too, if at all possible). If the OS knows more explicitly about the on-chip memory, then it will definitely contain any needed initialization code, as well as some sort of scheme to provide access to the user processes.
How is it decided which buffers would be in external memory, which ones would be allocated in internal SRAM, etc.
It seems unlikely to me that the operating system would try to automate such choices. Instead, I suspect that either the OS or a device driver would provide a generic interface to provide access to the on-chip memory, and leave it up to your user code to decide what to do with it.
When one calls calloc/malloc on such a system/code, the pointer returned is from which memory: internal or external?
Almost certainly, malloc and friends will return pointers into the general off-chip memory. In the driver-based approach suggested above, you'd use mmap to gain access to the on-chip memory. If you needed to do finer-grained allocation than that, you'd need to write your own allocator, or find one that can be given an explicit region of memory to work in.
Can a user make buffers to be allocated in the memories of his choice (internal/external)?
If by buffers you mean the regions returned from the standard malloc calls, probably not. But, if you mean "can a user program somehow get a pointer to the on-chip memory", then the answer is almost certainly yes, but the mechanism will depend on choices made when porting Linux to this system.
In ARM architectures, there is another memory called "tightly coupled memory" (TCM). What is that and how can user enable and use it? Can I declare buffers in this memory?
I don't know what this is. If I had to guess, I'd assume it's just another form of on-chip RAM, but since it has a different name, perhaps I'm wrong.
Do I need to see the memory map (if any) of the hardware board to understand about all these different physical memories present in a typical hardware board?
If the OS and/or device drivers have provided some sort of abstract access to these memory regions, then you won't need to know explicitly about the address map. This knowledge is, however, needed to implement this access in either the kernel or a device driver.
I hope this helps somewhat.