Inside dynamic memory management

I am a student and want to know more about dynamic memory management. In C++, calling operator new() allocates a memory block on the heap (free store), but I do not have a full picture of how this is achieved.
I have a few questions:
1) What mechanism does the OS use to allocate a memory block? As far as I know, there are some basic memory allocation schemes such as first-fit, best-fit and worst-fit. Does the OS use one of them to allocate memory dynamically on the heap?
2) Do different platforms such as Android, iOS, Windows and so on use different memory allocation algorithms to allocate a memory block?
3) In C++, when I call operator new() or malloc(), does the memory allocator pick a memory block at random in the heap?
I hope someone can help me.
Thanks

malloc is not a system call; it is a library (libc) routine that walks its internal data structures to give you the address of a free piece of memory of the required size. It only makes a system call if the process's data segment (i.e. the virtual memory it can use) is not "big enough" according to the logic of the malloc in question. (On Linux, the system call that enlarges the data segment is brk.)
Simply put, malloc provides fine-grained memory management, while the OS manages coarser, big chunks of memory made available to the process.
Not only different platforms but also different libraries use different malloc implementations; some programs (e.g. Python) use their own internal allocator instead, because they know their own usage patterns and can increase performance that way.
There is a lengthy article about malloc on Wikipedia.
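To make the split between "one big chunk" and "fine-grained blocks" concrete, here is a minimal first-fit sketch. It is not how any real libc implements malloc; the names toy_malloc, toy_free and POOL_SIZE are made up for illustration, and a static array stands in for the memory a real allocator would obtain from the OS.

```c
#include <stddef.h>

/* All names here (toy_malloc, toy_free, POOL_SIZE) are hypothetical. */
#define POOL_SIZE (64 * 1024)   /* stand-in for memory obtained from the OS */
#define ALIGN 16

typedef struct block {
    size_t size;          /* payload bytes that follow this header */
    int free;             /* 1 if the block is available */
    struct block *next;   /* next block in address order */
} block_t;

/* Header size rounded up so that payloads stay ALIGN-aligned. */
#define HDR ((sizeof(block_t) + ALIGN - 1) / ALIGN * ALIGN)

static _Alignas(ALIGN) unsigned char pool[POOL_SIZE];
static block_t *head = NULL;

void *toy_malloc(size_t n)
{
    if (n == 0 || n > POOL_SIZE) return NULL;
    n = (n + ALIGN - 1) / ALIGN * ALIGN;

    if (head == NULL) {                 /* first call: one big free block */
        head = (block_t *)pool;
        head->size = POOL_SIZE - HDR;
        head->free = 1;
        head->next = NULL;
    }
    for (block_t *b = head; b != NULL; b = b->next) {
        if (!b->free || b->size < n) continue;   /* first fit: take the first match */
        if (b->size >= n + HDR + ALIGN) {        /* split off the unused tail */
            block_t *rest = (block_t *)((unsigned char *)b + HDR + n);
            rest->size = b->size - n - HDR;
            rest->free = 1;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->free = 0;
        return (unsigned char *)b + HDR;
    }
    return NULL;   /* a real malloc would ask the OS for more memory here */
}

void toy_free(void *p)
{
    if (p == NULL) return;
    block_t *b = (block_t *)((unsigned char *)p - HDR);
    b->free = 1;
    /* Merge with free blocks that follow, to limit fragmentation. */
    while (b->next != NULL && b->next->free) {
        b->size += HDR + b->next->size;
        b->next = b->next->next;
    }
}
```

So the answer to question 3 is that the block is not chosen at random: the allocator's own search policy (first-fit in this sketch) picks it, which is why freeing a block and then requesting a smaller one can hand back the same address.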

Related

Using external and internal memory for heap

I have hooked up external SRAM in my project. What I want to do is use malloc() to store data in external OR internal memory at runtime. How can I decide, during code execution, in which memory malloc stores heap data? I know I have to edit the linker script, but after that ALL heap data will be stored in external memory.
Is there any linker command that says the next malloc() should allocate in external or internal memory? For variables we can use the __attribute__((section("name"))) variable attribute, but is there anything for the heap?
Thank you!
malloc from your C library can generally only use memory from one location. If you use newlib then it finds this memory using _sbrk. The default implementation of _sbrk depends on the definition of the symbol end or _end by the linker script, but you can also implement your own.
You will have to pick one location for malloc to access, and use your own custom function to allocate memory from somewhere else.
Many libraries and RTOS implementations do this. See, for example, mem_malloc in lwIP or rt_alloc_mem in Keil RTX.
There are many schemes you can use to decide which memory to use, for example having pools of fixed size blocks for a particular purpose. I tend to use the fastest internal SRAM for malloc because it will become quite fragmented. I then make sure to only use malloc for small things and then custom functions for larger allocations.
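As a rough illustration of the "pool of fixed-size blocks" idea, here is a small sketch of a custom allocator that could sit next to malloc. Everything named here (ext_alloc, ext_free, BLOCK_SIZE, BLOCK_COUNT, the .ext_sram section) is an assumption for the example, not part of newlib or any RTOS. The pool is an ordinary static array so the code runs anywhere; on a real target you would place it in the external SRAM region through your linker script.

```c
#include <stddef.h>

#define BLOCK_SIZE  256   /* hypothetical: bytes per block */
#define BLOCK_COUNT 32    /* hypothetical: number of blocks in the pool */

typedef union pool_block {
    union pool_block *next;             /* valid only while the block is free */
    unsigned char payload[BLOCK_SIZE];
} pool_block_t;

/* On a real target this array would be placed in external SRAM, e.g. with
 * __attribute__((section(".ext_sram"))) plus a matching linker-script entry.
 * The section name is an assumption; here it is a plain static array. */
static pool_block_t ext_pool[BLOCK_COUNT];
static pool_block_t *free_list = NULL;
static int pool_ready = 0;

static void ext_pool_init(void)
{
    /* Thread every block onto the free list. */
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        ext_pool[i].next = free_list;
        free_list = &ext_pool[i];
    }
    pool_ready = 1;
}

/* Returns one fixed-size block, or NULL if n does not fit or the pool is empty. */
void *ext_alloc(size_t n)
{
    if (!pool_ready) ext_pool_init();
    if (n == 0 || n > BLOCK_SIZE || free_list == NULL) return NULL;
    pool_block_t *b = free_list;
    free_list = b->next;
    return b->payload;
}

/* Pushes the block back on the free list; p must come from ext_alloc. */
void ext_free(void *p)
{
    if (p == NULL) return;
    pool_block_t *b = (pool_block_t *)p;
    b->next = free_list;
    free_list = b;
}
```

The caller then decides at runtime where data goes simply by choosing between malloc() and ext_alloc(). Because every block has the same size, this pool cannot fragment, but it is not protected against concurrent use from interrupts or other tasks.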

In programming environments that have automatic memory management, how often are the OS memory allocation routines invoked at runtime?

Do implementations pre-allocate blocks of memory for objects using malloc? When these blocks are used up, will additional memory be requested? When garbage collection runs and compaction occurs, will memory be returned to the OS via calls to free?
Do implementations pre-allocate blocks of memory for objects using malloc?
Yes. Most often they pre-allocate contiguous blocks of memory and implement their own allocation mechanism inside them. A common one is based on an allocation pointer that holds the address of the next object, so allocating an object simply means returning this address and advancing the pointer by the object's size in bytes. This is faster than relying on OS calls and gives better control over those memory regions. For example, in the CLR on Windows these blocks are called segments and are managed via VirtualAlloc/VirtualFree calls: first a fairly large memory region is reserved, and then more and more pages are committed as they are needed. malloc (or, more generally, the Heap API on Windows) is not used by the CLR.
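The allocation-pointer scheme described above can be sketched in a few lines. This is only an illustration of the idea, not the CLR's code; segment_t, segment_init, segment_alloc and segment_reset are hypothetical names, and the caller supplies the pre-allocated region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; a sketch of allocation-pointer ("bump") allocation. */
typedef struct {
    unsigned char *start;   /* first byte of the pre-allocated region */
    unsigned char *next;    /* allocation pointer: where the next object goes */
    unsigned char *limit;   /* one past the last usable byte */
} segment_t;

void segment_init(segment_t *s, void *mem, size_t size)
{
    s->start = (unsigned char *)mem;
    s->next  = s->start;
    s->limit = s->start + size;
}

/* Returns the current allocation pointer and advances it by n bytes
 * (rounded up to a multiple of 8), or NULL if the segment is full. */
void *segment_alloc(segment_t *s, size_t n)
{
    if (n > SIZE_MAX - 7) return NULL;            /* rounding would overflow */
    n = (n + 7) & ~(size_t)7;
    if (n > (size_t)(s->limit - s->next)) return NULL;
    void *obj = s->next;
    s->next += n;
    return obj;
}

/* Moves the allocation pointer back to the start, as a collector might do
 * once nothing in the segment is live any more. */
void segment_reset(segment_t *s)
{
    s->next = s->start;
}
```

Allocation is a bounds check plus a pointer increment, which is why runtimes prefer it over calling into the OS or a general-purpose malloc for every object.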
When these blocks are used up, will additional memory be requested?
Yes, more blocks may be created, but first the existing ones grow "inside" by committing (consuming) reserved memory.
When garbage collection runs and compaction occurs, will memory be returned to the OS via calls to free?
It depends on the specific runtime implementation, but you should not regard it as the main memory reclamation mechanism. Compaction works inside those preallocated memory blocks; for example, the allocation pointer is moved back to the left after compaction has occurred. But yes, in general, a segment may be returned to the OS when the GC decides that it is no longer needed (for example, when all objects living inside it have been reclaimed). However, on 32-bit architectures with quite limited virtual address space, this could lead to unwanted fragmentation, and reusing such a memory block was the better option. On 64-bit this may not be such a big problem; however, reusing those blocks may still be a good idea.

Kernel memory management: where do I begin?

I'm a bit of a noob when it comes to kernel programming, and was wondering if anyone could point me in the right direction for beginning the implementation of memory management in a kernel setting. I am currently working on a toy kernel and am doing a lot of research on the subject, but I'm a bit confused about memory management. There are so many different aspects to it, like paging and virtual memory mapping. Is there a specific order in which I should implement things, or any dos and don'ts? I'm not looking for any code or anything; I just need to be pointed in the right direction. Any help would be appreciated.
There are multiple aspects that you should consider separately:
Managing the available physical memory.
Managing the memory required by the kernel and its data structures.
Managing the virtual memory (space) of every process.
Managing the memory required by any process, i.e. malloc and free.
To be able to manage any of the other memory demands, you need to know how much physical memory you actually have and which parts of it are available for your use.
Assuming your kernel is loaded by a Multiboot-compatible boot loader, you'll find this information in the Multiboot information structure that the boot loader passes to you (on x86 its address arrives in ebx, while eax holds the Multiboot magic value).
The structure contains a memory map describing which memory areas are reserved and which are free to use.
You also need to store this information somehow, and keep track of what memory is allocated and freed. An easy method to do so is to maintain a bitmap, where bit N indicates whether the (fixed size S) memory area from N * S to (N + 1) * S - 1 is used or free. Of course you probably want to use more sophisticated methods like multilevel bitmaps or free lists as your kernel advances, but a simple bitmap as above can get you started.
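A minimal sketch of such a bitmap follows, assuming a fixed frame size S of 4096 bytes and a small, made-up amount of managed memory. The names (frame_alloc, frame_free, FRAME_COUNT and so on) are hypothetical, and a real kernel would first mark every frame reported as reserved by the boot loader.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE  4096u    /* S: size of one physical frame */
#define FRAME_COUNT 1024u    /* hypothetical: 4 MiB of managed memory */
#define FRAME_ALLOC_FAILED ((uintptr_t)-1)

/* Bit N is 1 if the frame covering N*S .. (N+1)*S - 1 is in use. */
static uint8_t frame_bitmap[FRAME_COUNT / 8];

static int frame_is_used(size_t n)
{
    return (frame_bitmap[n / 8] >> (n % 8)) & 1;
}

void frame_mark_used(size_t n)
{
    frame_bitmap[n / 8] |= (uint8_t)(1u << (n % 8));
}

void frame_mark_free(size_t n)
{
    frame_bitmap[n / 8] &= (uint8_t)~(1u << (n % 8));
}

/* Linear scan for the first free frame; returns its physical address
 * (N * S) or FRAME_ALLOC_FAILED if every frame is taken. */
uintptr_t frame_alloc(void)
{
    for (size_t n = 0; n < FRAME_COUNT; n++) {
        if (!frame_is_used(n)) {
            frame_mark_used(n);
            return (uintptr_t)n * FRAME_SIZE;
        }
    }
    return FRAME_ALLOC_FAILED;
}

void frame_free(uintptr_t addr)
{
    frame_mark_free(addr / FRAME_SIZE);
}
```

The scan is O(number of frames), which is the cost that multilevel bitmaps or free lists are meant to reduce later on.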
This memory manager usually only provides "large" memory chunks, typically multiples of 4 KB. This is of course of no use for dynamic memory allocation in the style of malloc and free that you're used to from application programming.
Since dynamic memory allocation will greatly ease implementing advanced features of your kernel (multitasking, inter-process communication, ...), you usually write a memory manager especially for the kernel. It provides means for allocation (kalloc) and deallocation (kfree) of arbitrarily sized memory chunks. This memory comes from pool(s) that are allocated using the physical memory manager from above.
All of the above is happening inside the kernel. You probably also want to provide applications means to do dynamic memory allocation. Implementing this is very similar in concept to the management of physical memory as done above:
A process only sees its own virtual address space. Some parts of it are unusable for the process (for example the area into which the kernel memory is mapped), but most of it will be "free to use" (that is, no actual physical memory is associated with it). As a minimum, the kernel needs to provide applications with a means to allocate and free single pages of their address space. Allocating a page results (under the hood, invisible to the application) in a call to the physical memory manager, and in a mapping from the requested page to this newly allocated memory.
Note though that many kernels either give their processes more sophisticated access to their own address space or directly implement some of the following tasks in the kernel.
Being able to allocate and free pages (mostly 4 KB) as before doesn't help with dynamic memory management, but as before this is usually handled by another memory manager, which uses these large memory chunks as a pool to provide smaller chunks to the application. A prominent example is Doug Lea's allocator. Memory managers like these are usually implemented as a library (most likely part of the standard library) that is linked into every application.

How do programs allocate large amounts of memory?

I have 3 questions concerning memory allocation that I thought better to put into one question than 3.
When memory is allocated, as I understand it, it is allocated on the heap, which is just 16 MB. How then do programs such as video games or modern browsers manage to use over 1 GB?
Since it is obviously possible for this much memory to be used, why can it not be allocated at the start? I have found the most I can allocate in High Level Assembly language is around 100MB. This is a lot more than 16MB, and far less than I have 3, so where does this limitation come from?
Why allocate memory in the first place, rather than allocating variables and letting the compiler/system handle it?
When memory is allocated, as I understand it, it is allocated on the heap,
which is just 16 MB. How then do programs such as video games or modern
browsers manage to use over 1 GB?
The heap can grow. It isn't limited to any fixed value, and certainly not to 16 MB. You can easily allocate 1 GB of heap; just write a test program and you'll see.
Since it is obviously possible for this much memory to be used, why
can it not be allocated at the start? I have found the most I can
allocate in High Level Assembly language is around 100MB. This is a
lot more than 16MB, and far less than I have 3, so where does this
limitation come from?
I'm not sure why your OS isn't fulfilling larger allocation requests. Perhaps memory fragmentation? It is going to be a problem specific to your setup, which you didn't share. I can allocate much more memory than that without an issue.
You can try the mmap system call if malloc (which uses the brk system call) is having some sort of issue. Note that GNU libc's malloc itself uses mmap instead of brk when the allocation is large enough (by default, at or above the 128 KiB mmap threshold).
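For reference, here is a small sketch of asking the kernel for a large region directly with mmap, assuming a Linux or other POSIX-like system that supports MAP_ANONYMOUS. The wrapper names big_alloc and big_free are made up for the example.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrappers around mmap/munmap. */

/* Maps `bytes` of private, zero-filled anonymous memory.
 * Returns NULL on failure (mmap itself reports MAP_FAILED). */
void *big_alloc(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Unmaps a region obtained from big_alloc; the same length must be passed.
 * Returns 0 on success, -1 on error. */
int big_free(void *p, size_t bytes)
{
    return munmap(p, bytes);
}
```

The pages are only backed by physical memory once they are touched, so mapping a large region is cheap until the program actually writes to it.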
Why allocate memory in the first place, rather than allocating
variables and letting the compiler/system handle it?
Variables must live in memory somewhere. What you are asking is "why manually manage memory? Why can't some algorithm do that for me?". It is actually very common for the compiler and a runtime component to handle allocation and freeing; it's called garbage collection.

Multithreaded Heap Management

In C/C++ I can allocate memory in one thread and delete it in another thread. Yet whenever one requests memory from the heap, the heap allocator needs to walk the heap to find a suitably sized free area. How can two threads access the same heap efficiently without corrupting the heap? (Is this done by locking the heap?)
In general, you do not need to worry about the thread-safety of your memory allocator. All standard memory allocators -- that is, those shipped with MacOS, Windows, Linux, etc. -- are thread-safe. Locks are a standard way of providing thread-safety, though it is possible to write a memory allocator that only uses atomic operations rather than locks.
Now it is an entirely different question whether those memory allocators scale; that is, is their performance independent of the number of threads performing memory operations? In most cases, the answer is no; they either slow down or can consume a lot more memory. The first scalable allocator in both dimensions (speed and space) is Hoard (which I wrote); the Mac OS X allocator is inspired by it -- and cites it in the documentation -- but Hoard is faster. There are others, including Google's tcmalloc.
Yes, an "ordinary" heap implementation supporting multithreaded code will necessarily include some sort of locking to ensure correct operation. Under fairly extreme conditions (a lot of heap activity) this can become a bottleneck; more specialized heaps (generally providing some sort of thread-local heap) are available which can help in this situation. I've used Intel TBB's "scalable allocator" to good effect. tcmalloc and jemalloc are other examples of mallocs implemented with multithreaded scaling in mind.
Some timing comparisons between single-threaded and multithread-aware mallocs here.
This is an Operating Systems question, so the answer is going to depend on the OS.
On Windows, each process gets its own heap. That means multiple threads in the same process are (by default) sharing a heap. Thus the OS has to thread-synchronize its allocation and deallocation calls to prevent heap corruption. If you don't like the idea of the possible contention that may ensue, you can get around it by using the Heap* routines (for example, HeapCreate to make a separate private heap). You can even replace malloc (in C) and overload new (in C++) to call them.
I found this link.
Basically, the heap can be divided into arenas. When requesting memory, each arena is checked in turn to see whether it is locked. This means that different threads can access different parts of the heap at the same time safely. Frees are a bit more complicated because each free must be freed from the arena that it was allocated from. I imagine a good implementation will get different threads to default to different arenas to try to minimize contention.
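The arena idea can be sketched roughly as follows. This is not glibc's implementation: each arena here is just a small bump allocator guarded by an atomic flag used as a try-lock, and all names (arena_alloc, arena_try_lock, arena_unlock, arena_of, ARENA_COUNT, ARENA_SIZE) are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_COUNT 4      /* hypothetical */
#define ARENA_SIZE  4096   /* hypothetical */

typedef struct {
    atomic_int locked;      /* 1 while some thread is working in this arena */
    size_t used;            /* bytes handed out so far */
    _Alignas(16) unsigned char mem[ARENA_SIZE];
} arena_t;

static arena_t arenas[ARENA_COUNT];   /* zero-initialized: all unlocked, empty */

/* Try-lock: returns 1 if the caller now owns arena i, 0 if it was busy. */
int arena_try_lock(int i)
{
    return atomic_exchange_explicit(&arenas[i].locked, 1,
                                    memory_order_acquire) == 0;
}

void arena_unlock(int i)
{
    atomic_store_explicit(&arenas[i].locked, 0, memory_order_release);
}

/* Checks each arena in turn, skipping any that another thread holds.
 * On success stores the arena index in *index (if non-NULL). */
void *arena_alloc(size_t n, int *index)
{
    if (n == 0 || n > ARENA_SIZE) return NULL;
    n = (n + 15) & ~(size_t)15;
    for (int i = 0; i < ARENA_COUNT; i++) {
        if (!arena_try_lock(i)) continue;      /* busy: move on, do not wait */
        void *p = NULL;
        if (n <= ARENA_SIZE - arenas[i].used) {
            p = arenas[i].mem + arenas[i].used;
            arenas[i].used += n;
        }
        arena_unlock(i);
        if (p != NULL) {
            if (index != NULL) *index = i;
            return p;
        }
    }
    return NULL;   /* every arena was busy or full */
}

/* A free operation must go back to the owning arena; this finds it. */
int arena_of(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    for (int i = 0; i < ARENA_COUNT; i++) {
        uintptr_t lo = (uintptr_t)arenas[i].mem;
        if (addr >= lo && addr < lo + ARENA_SIZE) return i;
    }
    return -1;
}
```

Two threads that allocate at the same moment end up in different arenas instead of waiting on one lock. A production allocator would also remember a preferred arena per thread and would block or create a new arena rather than fail when all of them are busy.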
Yes, normally access to the heap has to be locked. Any time you have a shared resource, that resource needs to be protected; memory is a resource.
This will depend heavily on your platform/OS, but I believe this is generally OK on major systems. Before C11 and C++11, the C and C++ standards did not define threads, so strictly by the standard I believe the answer was "the heap is not protected", and you had to rely on the platform for some sort of multithreaded protection of your heap access.
However, at least with Linux and GCC, I believe that enabling -pthread will give you this protection automatically...
Additionally, here is another related question:
C++ new operator thread safety in linux and gcc 4
