Is memory allocated by the JVM, and is all the data stored in RAM or on the hard disk? Or is memory allocated by the constructor? If so, how is memory allocated for static class members?
The JVM creates a memory area called the heap on startup, where the application's objects are allocated. It is created in RAM and is garbage collected as it fills up. See the Memory Management documentation for details -> Understanding Memory Management
Roughly, there are different kinds of allocations: objects (instances of reference types, together with their fields) are stored in the so-called heap, while local primitive variables and local object references are stored on the stack. Both the stack and the heap are in RAM, in the JVM process's memory area.
Strictly speaking, object allocation is triggered by the new expression: the JVM allocates the memory, and the constructor then initializes it.
This awesome article explains allocations in a more precise and correct way.
I want to know technical details about garbage collection (GC) and memory management in Erlang/OTP.
But I cannot find them on erlang.org or in its documentation.
I have found some articles online which talk about GC in a very general manner, such as what garbage collection algorithm is used.
To clarify things, let's define the memory layout and then talk about how GC works.
Memory Layout
In Erlang, each thread of execution is called a process. Each process has its own memory and that memory layout consists of three parts: Process Control Block, Stack and Heap.
PCB: Process Control Block holds information like process identifier (PID), current status (running, waiting), its registered name, and other such info.
Stack: It is a downward growing memory area which holds incoming and outgoing parameters, return addresses, local variables and temporary spaces for evaluating expressions.
Heap: It is an upward growing memory area which holds process mailbox messages and compound terms. Binary terms which are larger than 64 bytes are NOT stored in process private heap. They are stored in a large Shared Heap which is accessible by all processes.
Garbage Collection
Currently Erlang uses generational garbage collection that runs independently inside each Erlang process's private heap, plus reference-counting garbage collection for the global shared heap.
Private Heap GC: It is generational, so it divides the heap into two segments: the young and old generations. There are also two collection strategies: Generational (Minor) and Fullsweep (Major). The generational GC collects only the young heap, whereas fullsweep collects both the young and old heaps.
Shared Heap GC: It is reference counting. Each object in shared heap (Refc) has a counter of references to it held by other objects (ProcBin) which are stored inside private heap of Erlang processes. If an object's reference counter reaches zero, the object has become inaccessible and will be destroyed.
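To make the reference-counting idea above concrete, here is a minimal C++ sketch. It is not how the Erlang VM is implemented; the names SharedBinary, ProcBinHandle and g_freed are hypothetical stand-ins for a shared-heap binary, a ProcBin-like handle in a private heap, and a demo counter.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a large binary stored in the shared heap.
struct SharedBinary {
    explicit SharedBinary(std::string b) : bytes(std::move(b)), refcount(0) {}
    std::string bytes;
    std::size_t refcount;  // number of handles currently pointing here
};

// Demo-only counter of how many shared binaries have been destroyed.
int g_freed = 0;

// Hypothetical stand-in for a ProcBin: a small handle that would live in a
// process's private heap and keeps the shared binary alive.
class ProcBinHandle {
public:
    explicit ProcBinHandle(SharedBinary* target) : target_(target) {
        ++target_->refcount;
    }
    ProcBinHandle(const ProcBinHandle& other) : target_(other.target_) {
        ++target_->refcount;
    }
    ProcBinHandle& operator=(const ProcBinHandle&) = delete;

    // Dropping a handle decrements the count; the last one frees the binary.
    ~ProcBinHandle() {
        if (--target_->refcount == 0) {
            delete target_;
            ++g_freed;
        }
    }

    std::size_t count() const { return target_->refcount; }

private:
    SharedBinary* target_;
};
```

Copying a handle only bumps the counter; the binary's bytes are never copied, and the binary goes away when the last handle is destroyed.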
To get more details and performance hints, just look at my article which is the source of the answer: Erlang Garbage Collection Details and Why It Matters
A reference paper for the algorithm: One Pass Real-Time Generational Mark-Sweep Garbage Collection (1995) by Joe Armstrong and Robert Virding (at CiteSeerX)
Abstract:
Traditional mark-sweep garbage collection algorithms do not allow reclamation of data until the mark phase of the algorithm has terminated. For the class of languages in which destructive operations are not allowed we can arrange that all pointers in the heap always point backwards towards "older" data. In this paper we present a simple scheme for reclaiming data for such language classes with a single pass mark-sweep collector. We also show how the simple scheme can be modified so that the collection can be done in an incremental manner (making it suitable for real-time collection). Following this we show how the collector can be modified for generational garbage collection, and finally how the scheme can be used for a language with concurrent processes.
Erlang has a few properties that make GC actually pretty easy.
1 - Every variable is immutable, so a variable can never point to a value that was created after it.
2 - Values are copied between Erlang processes, so the memory referenced in a process is almost always completely isolated.
Both of these (especially the latter) significantly limit the amount of the heap that the GC has to scan during a collection.
Erlang uses a copying GC. During a GC, the process is stopped and the live data is copied from the from-space to the to-space. I forget the exact percentages, but the heap will be increased if something like only 25% of the heap can be collected during a collection, and it will be decreased if 75% of the process heap can be collected. A collection is triggered when a process's heap becomes full.
The only exception is large values that are sent to another process. These are copied into a shared space and are reference counted. When a reference to a shared object is collected, the count is decreased; when that count reaches 0, the object is freed. No attempts are made to handle fragmentation in the shared heap.
One interesting consequence of this is, for a shared object, the size of the shared object does not contribute to the calculated size of a process's heap, only the size of the reference does. That means, if you have a lot of large shared objects, your VM could run out of memory before a GC is triggered.
Most of this is taken from the talk Jesper Wilhelmsson gave at EUC2012.
I don't know your background, but apart from the paper already pointed out by jj1bdx, you can also give Jesper Wilhelmsson's thesis a chance.
BTW, if you want to monitor memory usage in Erlang to compare it to e.g. C++ you can check out:
Erlang Instrument Module
Erlang OS_MON Application
Hope this helps!
Do implementations pre-allocate blocks of memory for objects using malloc? When these blocks are used up, will additional memory be requested? When garbage collection runs and compaction occurs, will memory be returned to the OS via calls to free?
Do implementations pre-allocate blocks of memory for objects using malloc?
Yes. Most often they pre-allocate contiguous blocks of memory and implement their own allocation mechanism inside them (for example, based on an allocation pointer that holds the memory address for the next object, so allocating an object simply means returning this address and advancing the pointer by the given number of bytes). This is faster than relying on OS calls and gives better control of those memory regions. For example, in the case of the CLR on Windows, those blocks are called segments and are managed via VirtualAlloc/VirtualFree calls. First, quite a big memory region is reserved, and then more and more pages are committed as they are needed. Malloc (or more generally, the Heap API in the case of Windows) is not used in the CLR.
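The allocation-pointer idea described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: BumpRegion is a hypothetical name, the region is backed by a std::vector rather than by reserved and committed pages, and nothing here reflects the actual CLR code.

```cpp
#include <cstddef>
#include <vector>

// A pre-allocated region with an allocation pointer (kept as an offset).
class BumpRegion {
public:
    explicit BumpRegion(std::size_t capacity) : storage_(capacity), next_(0) {}

    // Returns `size` bytes whose offset is a multiple of `align` (a power of
    // two no larger than alignof(std::max_align_t)), or nullptr when the
    // region is exhausted. Allocation is just "round up, check, advance".
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (next_ + align - 1) & ~(align - 1);
        if (start > storage_.size() || size > storage_.size() - start) {
            return nullptr;  // a real runtime would commit more pages or GC
        }
        next_ = start + size;
        return storage_.data() + start;
    }

    // Bytes consumed so far, including alignment padding.
    std::size_t used() const { return next_; }

    // Moves the allocation pointer back to the start, roughly what happens
    // to it after everything in the region has been reclaimed.
    void reset() { next_ = 0; }

private:
    std::vector<unsigned char> storage_;
    std::size_t next_;
};
```

Two consecutive allocate(8, 8) calls return addresses exactly 8 bytes apart, which is why this scheme is so cheap compared with a general-purpose allocator.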
When these blocks are used up, will additional memory be requested?
Yes, more blocks may be created, but first the existing ones grow "inside" by committing (consuming) reserved memory.
When garbage collection runs and compaction occurs, will memory be returned to the OS via calls to free?
It depends on the specific runtime implementation, but you should not look at it as the main memory reclamation mechanism. Compaction works inside those preallocated memory blocks: for example, the allocation pointer is moved back to the left after compaction has occurred. But yes, in general, a segment may be returned to the OS when the GC decides that it is no longer needed (for instance, when all objects living inside it have been reclaimed). However, on 32-bit architectures with quite limited virtual address space, this could lead to unwanted fragmentation, so reusing such a memory block was the better option. On 64-bit this may not be such a big problem, but reusing those blocks may still be a good idea.
I'm studying about operating systems currently and I am a bit confused.
When a process is started for the first time, does the OS know the size of the heap? (I am guessing it knows the size of the data & code segments)
Heap is just a concept. There is no real, single heap. A heap is a block of memory that can be used for dynamic memory requests. A heap is created by library routines that allocate dynamic memory. There can be many heaps or no heap at all.
The OS never knows the size of the process heap.
I suspect the answer to my question is language specific, so I'd like to know about C and C++. When I call free() on a buffer or use delete[], how does the program know how much memory to free?
Where is the size of the buffer or of the dynamically allocated array stored and why isn't it available to the programmer as well?
Each implementation will be different, but typically the runtime allocates a bit more than asked for, and uses some hidden fields at the start of the block to remember the allocated size. The address returned to the caller is therefore offset a bit from the start of the memory claimed from the heap.
It isn't available to the caller because the true amount of memory claimed from the heap is an implementation detail, and will vary between compilers and platforms. As for knowing how much the caller asked for, rather than how much was allocated from the heap... well, the language designers assume that the programmer is capable of remembering this if needed.
The heap keeps track of all memory blocks, both allocated and free, specifically for that purpose. A typical (if naive) implementation allocates memory, uses several bytes at the beginning for bookkeeping, and returns the address past those bytes. On subsequent operations (free/realloc), it subtracts a few bytes from the pointer to get back to the bookkeeping area.
Some heap implementations (say, Windows' GlobalAlloc()) let you query the block size given the starting address. The C/C++ RTL heap offers no such service.
Note that malloc() sometimes overallocates memory, so information about the malloc'ed block size would be of limited utility. C++ new[]'ed arrays are a whole other matter: for those, knowing the exact array size is essential for array destruction to work properly. Still, there's no such thing in C++ as a dynamic_sizeof operator.
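As a rough illustration of the bookkeeping-in-front-of-the-block scheme, here is a small C++ sketch layered on top of malloc. The names BlockHeader, sized_alloc, sized_size and sized_free are made up for this example; real allocators keep different and more compact metadata.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping record stored immediately before the user block.
// The alignas keeps the address handed to the caller suitably aligned.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t requested;  // number of bytes the caller asked for
};

// Allocates room for the header plus the payload and returns the address
// just past the header.
void* sized_alloc(std::size_t size) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (raw == nullptr) return nullptr;
    BlockHeader* header = new (raw) BlockHeader{size};
    return header + 1;
}

// Steps back from the user pointer to the header to recover the size.
std::size_t sized_size(const void* user_ptr) {
    const BlockHeader* header = static_cast<const BlockHeader*>(user_ptr) - 1;
    return header->requested;
}

// Frees the whole block, starting at the hidden header.
void sized_free(void* user_ptr) {
    if (user_ptr == nullptr) return;
    std::free(static_cast<BlockHeader*>(user_ptr) - 1);
}
```

The caller only ever sees the pointer returned by sized_alloc, yet sized_free needs no size argument because the header travels with the block.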
The memory allocator that gave you that chunk of memory is responsible for all that maintenance data. Typically it's stored in the beginning of the chunk (right before the actual address you use) so it's easy to access on freeing.
Regarding your other question: why should your app know about it? It's not your concern. This decouples memory allocation management from the app, so you can use different allocators (for performance or debugging reasons).
It's stored internally in a location dependent on the language/compiler/OS.
Sometimes it is available (.Length in C# for example), though that may only refer to how much memory you're allowed to use, and not the object's total size.
Usually because the size to free is stored somewhere within the allocated buffer. A common technique is to have the size stored in memory just previous to the returned pointer.
Why isn't such information available to the programmer? I don't really know. I guess it's because an implementation may be able to provide memory allocation without actually needing to store the size, and such an implementation (if it exists) shouldn't be penalized by the others.
It's not so much language specific. It's all done by the memory manager.
How it knows depends on how the memory manager manages memory. The general idea is that the memory manager allocates more memory than you ask for. It stores extra data about the allocated blocks of memory in those locations. Thus, when you release the memory, it uses the information stored in those locations (reconstructed based on the given pointer) and figures out how much actual memory to stop managing.
Don't confound deallocation and destruction.
free() knows the size of the memory because of some internal magic ("implementation-defined"), e.g. the allocator could keep a list of all the allocated memory regions indexed by their corresponding pointers and just look up the pointer to know what to deallocate; or that information could be stored next to the allocated memory itself in some hidden block of data.
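The first option mentioned above, a lookup structure indexed by pointer, could look roughly like this in C++. TableAllocator is a hypothetical name, and wrapping malloc/free this way is only meant to show the idea, not how any particular C library works.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Keeps the size of every live block in a side table keyed by its address,
// instead of storing it next to the block itself.
class TableAllocator {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) sizes_[p] = size;
        return p;
    }

    // Looks the pointer up, frees the block, and returns how many bytes were
    // released. Returns 0 for pointers this allocator never handed out.
    std::size_t release(void* p) {
        auto it = sizes_.find(p);
        if (it == sizes_.end()) return 0;
        std::size_t size = it->second;
        sizes_.erase(it);
        std::free(p);
        return size;
    }

    std::size_t live_blocks() const { return sizes_.size(); }

private:
    std::unordered_map<void*, std::size_t> sizes_;
};
```

A side table costs an extra lookup per free, but it also lets the allocator reject pointers it never handed out.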
The array-delete expression delete[] arr; does not only deallocate memory, it also invokes the destructor of every element. For that, knowing the memory size is not sufficient; we also need to know the number of elements. This is why new T[N] may actually allocate more than sizeof(T) * N bytes of memory (typically to record the element count), so that the array-deleter knows how many destructors to call. All that memory is properly deallocated by the corresponding delete-operator.
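A quick way to see that delete[] recovers the element count by itself is to count live objects. In this sketch Tracked and count_alive_then_delete are hypothetical names; where the count is actually stored is up to the implementation.

```cpp
#include <cstddef>

// Counts how many instances currently exist.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Creates n elements with new[], records how many are alive, then destroys
// them with delete[]. Returns the number that were alive before deletion.
int count_alive_then_delete(std::size_t n) {
    Tracked* arr = new Tracked[n];   // runs n constructors
    int alive_before = Tracked::alive;
    delete[] arr;                    // no count passed, yet n destructors run
    return alive_before;
}
```

After the call, Tracked::alive is back to its previous value, which shows that every element's destructor ran even though the code never told delete[] how many elements there were.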
I am a student and want to know more about dynamic memory management. In C++, calling operator new() allocates a memory block on the heap (free store). However, I don't have a full picture of how this is achieved.
There are a few questions:
1) What mechanism does the OS use to allocate a memory block? As far as I know, there are some basic memory allocation schemes like first-fit, best-fit and worst-fit. Does the OS use one of them to allocate memory dynamically on the heap?
2) Do different platforms such as Android, iOS, Windows and so on use different memory allocation algorithms to allocate a memory block?
3) In C++, when I call operator new() or malloc(), does the memory allocator pick a memory block at a random location in the heap?
Hope anyone can help me.
Thanks
malloc is not a system call; it is a library (libc) routine that goes through some of its internal structures to give you the address of a free piece of memory of the required size. It only makes a system call if the process's data segment (i.e. the virtual memory it can use) is not "big enough" according to the logic of the malloc in question. (On Linux, the system call to enlarge the data segment is brk; glibc's malloc also uses mmap for large requests.)
Simply said, malloc provides fine-grained memory management, while OS manages coarser, big chunks of memory made available to that process.
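To illustrate that split between coarse chunks and fine-grained requests, here is a toy C++ sketch. ChunkedAllocator is a hypothetical name, std::malloc stands in for the real system call (brk or similar) purely so the example stays portable, and individual blocks are never freed or reused, unlike in a real malloc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Serves small requests out of big chunks and only goes back to the
// "OS" (here: std::malloc as a stand-in) when the current chunk is full.
class ChunkedAllocator {
public:
    explicit ChunkedAllocator(std::size_t chunk_size)
        : chunk_size_(chunk_size), used_(0) {}
    ChunkedAllocator(const ChunkedAllocator&) = delete;
    ChunkedAllocator& operator=(const ChunkedAllocator&) = delete;
    ~ChunkedAllocator() {
        for (void* chunk : chunks_) std::free(chunk);
    }

    void* allocate(std::size_t size) {
        const std::size_t a = alignof(std::max_align_t);
        if (size > chunk_size_) return nullptr;  // sketch: no oversized path
        size = (size + a - 1) / a * a;           // keep blocks aligned
        if (size > chunk_size_) return nullptr;
        if (chunks_.empty() || size > chunk_size_ - used_) {
            void* chunk = std::malloc(chunk_size_);  // coarse "OS" request
            if (chunk == nullptr) return nullptr;
            chunks_.push_back(chunk);
            used_ = 0;
        }
        void* p = static_cast<unsigned char*>(chunks_.back()) + used_;
        used_ += size;
        return p;
    }

    // How many times the allocator had to ask for a new chunk.
    std::size_t os_requests() const { return chunks_.size(); }

private:
    std::size_t chunk_size_;
    std::size_t used_;
    std::vector<void*> chunks_;
};
```

With a 1024-byte chunk, ten 16-byte allocations cost a single chunk request; only a request that no longer fits triggers a second one.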
Not only different platforms, but also different libraries use different malloc implementations; some programs (e.g. Python) use their own internal allocators instead, as they know their own usage patterns and can increase performance that way.
There is a lengthy article about malloc on Wikipedia.