How is disk space being used/consumed by programs?

A naive question:
Recently my disk ran out of space.
I kept getting java.lang.OutOfMemoryError: Java heap space, and later VirtualBox reported a "Not Enough Free Space available on disk" error.
It turned out that my 256GB SSD was almost entirely used up.
So I was wondering: how can running programs consume my disk space?
How does this work?
I know the basics: programs allocate space on the heap/stack and deallocate it after use. (Correct me if I'm wrong.)
But if that's the case, then the disk should not fill up, right? (Assuming I don't add anything else to my desktop and only use it to run a fixed set of programs.)
I really want to understand how disk space and memory are consumed by running programs.
If this question has been asked before, please link to it.
I apologize for the naive question, but I believe the answer will help fellow programmers like me.
Thanks for making it clearer. To restate: Q1: Why do programs consume disk space? Q2: How does "java.lang.OutOfMemoryError: Java heap space" occur, and is it related to memory?

Why do programs consume disk space?
I know the basics: programs allocate space on the heap/stack and deallocate it after use. But if that's the case, then the disk should not fill up, right?
In fact, it can be used up. Memory allocations can consume hard-disk space if the allocation in your process's virtual memory happens to be mapped to a pagefile on disk, and your pagefile size is set to be managed by the operating system.
If you want to know more about memory mapping there's a great question here:
Understanding Virtual Address, Virtual Memory and Paging
The pagefile growth won't actually be a direct response to your allocation; it's more a response to the new commit size getting close to the reserved size. If you want to know more about this process (commit vs. reserved, stack expansion, etc.) I recommend reading Pushing the Limits of Windows: Physical Memory.
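To make the reserve-versus-commit distinction concrete, here is a minimal sketch of my own (Linux-flavoured, with swap playing the role of the Windows pagefile): mmap() hands out address space cheaply, and only touching the pages forces the kernel to actually back them with RAM or swap.

    /* Sketch: reserving vs. committing memory (Linux, swap ~ pagefile). */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t size = (size_t)1 << 30;          /* 1 GiB of address space */
        char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        puts("Reserved 1 GiB; almost nothing is committed yet.");
        getchar();                              /* check RSS/swap usage now */

        memset(p, 0xAB, size);                  /* touching commits the pages */
        puts("Touched every page; the kernel now has to back them with RAM or swap.");
        getchar();

        munmap(p, size);
        return 0;
    }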
Why does java.OutOfMemoryError occur?
http://docs.oracle.com/javase/7/docs/api/java/lang/OutOfMemoryError.html
Thrown when the Java Virtual Machine cannot allocate an object because it is out of memory, and no more memory could be made available by the garbage collector.
Generally this happens because the JVM has reached its configured heap limit (-Xmx) and the garbage collector can't reclaim enough space; it can also be a symptom of the pagefile being too small or the disk holding it being too full.
See also:
How to deal with "java.lang.OutOfMemoryError: Java heap space" error (64MB heap size)
java.lang.OutOfMemoryError: Java heap space

Related

How does the OS handle memory leaks?

I searched quite a lot for this question but couldn't find my exact query, although it seems general enough that it has probably been asked and answered somewhere.
I want to know what happens after a process that causes a memory leak terminates. In my opinion it's not a big deal because of virtual memory: the physical pages can still be allocated to other/new processes even though the old process leaked them.
But I have also read that memory leaks can force you to restart your system, and I don't understand why.
Recommended reading: Operating Systems: Three Easy Pieces
On common OSes (e.g. Linux, Windows, MacOSX, Android) each process has its own virtual address space (and the heap memory, e.g. used for malloc or mmap, is inside that virtual address space), and when the process terminates, its entire virtual address space is destroyed.
So memory leaks don't survive the process itself.
There could be subtle corner cases (e.g. leaks on using shm_overview(7) or shmget(2)).
Read (for Linux) proc(5), try cat /proc/self/maps, and see also this. Learn to use valgrind and the Address Sanitizer.
Read also about Garbage Collection. It is quite relevant.
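To make the "leaks don't survive the process" point concrete, here is a minimal sketch of my own (Linux-only, since it reads /proc/self/maps): the leaked allocations end up in the [heap] mapping, and that mapping, together with the rest of the address space, is torn down when the process exits.

    /* Sketch: leak some heap memory, then dump /proc/self/maps. */
    #include <stdio.h>
    #include <stdlib.h>

    static void dump_maps(void)
    {
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f) { perror("fopen"); return; }
        char line[512];
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);
        fclose(f);
    }

    int main(void)
    {
        char *leak = NULL;
        for (int i = 0; i < 1000; i++)
            leak = malloc(4096);   /* deliberately never freed */
        (void)leak;
        dump_maps();               /* look for the [heap] line */
        return 0;                  /* the kernel reclaims everything here */
    }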
In modern operating systems the address space is divided into a user space and a system space. The system space is the same for all processes.
When you kill a process, that destroys the user space for the process. If an application has a memory leak, killing the process remedies the leak.
However, the operating system can also allocate memory in the system space. When there is a memory leak in the operating system's allocation of system-space memory, killing processes does not free it up.
That is the type of memory leak that forces you to reboot the system.

Why can't a process access contiguous memory addresses in physical memory?

According to Microsoft documentation in the following link :
https://msdn.microsoft.com/en-us/library/windows/hardware/hh439648%28v=vs.85%29.aspx
A program can use a contiguous range of virtual addresses to access a
large memory buffer that is not contiguous in physical memory.
So the question is: why can't a process have contiguous memory in physical memory?
There's also another question about the documentation, specifically the picture that illustrates virtual memory for user and system space:
Is the system virtual address space unique across the whole of memory, while there is a separate virtual address space for each process?
Thanks.
When a process is first loaded into memory, the OS can optimize by loading the process's pages contiguously into physical memory. But the pages can't always stay contiguous, because of swapping: other processes and data also occupy physical memory, so when some of the process's pages become less used they are swapped out to the hard drive, and when they are needed again they are not guaranteed to be loaded back to the same spot, since another process's page may be lying there by then. You should read about virtual memory to gain a good understanding of all of this.
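If you want to see the scattering for yourself, here is a rough Linux-only sketch of my own that reads /proc/self/pagemap to print the physical frame number (PFN) behind each page of a small buffer. Note that seeing real frame numbers usually requires root; on recent kernels unprivileged reads report the PFN as zero.

    /* Sketch: contiguous virtual pages, scattered physical frames. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);
        size_t npages = 8;
        char *buf = malloc(npages * page);
        memset(buf, 1, npages * page);                 /* fault the pages in */

        int fd = open("/proc/self/pagemap", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        for (size_t i = 0; i < npages; i++) {
            uintptr_t vaddr = (uintptr_t)buf + i * page;
            uint64_t entry;
            off_t off = (off_t)(vaddr / page) * sizeof entry;
            if (pread(fd, &entry, sizeof entry, off) != (ssize_t)sizeof entry)
                break;
            uint64_t pfn = entry & ((1ULL << 55) - 1); /* bits 0-54: frame no. */
            int present = (int)(entry >> 63) & 1;      /* bit 63: page present */
            printf("virtual page %2zu: present=%d pfn=0x%llx\n",
                   i, present, (unsigned long long)pfn);
        }
        close(fd);
        free(buf);
        return 0;
    }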
Your question is simple: you are asking why we can have a large contiguous buffer in virtual memory but not in physical memory. That's because we are limited by the hardware. If every process could have as much physically contiguous memory as it wanted, manufacturers would have to build something like 1024GB memory modules to satisfy us; instead we get by with 8GB of RAM and are satisfied. Virtual memory exists to satisfy those needs and to make the hardware much more efficient.
Hope it helps.

memory management and segmentation faults in modern day systems (Linux)

In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My doubts are:
If each process has its own address space, then it should be free to access any address in that space. So apart from permission-restricted sections like .data, .bss, .text, etc., one should be free to change the value at any address. But this usually gives a segmentation fault. Why?
To acquire dynamic memory we need to call malloc. If the whole virtual address space is made available to a process, why can't it access that space directly?
Different runs of a program result in different addresses for variables (both on the stack and on the heap). Why is that, when the environment for each run is the same? Does it not affect the amount of addressable memory available for use? (Does it have something to do with address space randomization?)
Some links on memory allocation (e.g. in the heap).
The information available in different places is very confusing, as it mixes old and modern systems, often without distinguishing between them. It would be helpful if someone could clear up these doubts with modern systems in mind, say Linux.
Thanks.
Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
different memory regions serve different purposes.
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stack, so the OS might as well forbid that. Moreover, the thread stack can be de-allocated when the thread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches the kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set) or above 3<<30 (top two bits set) is kernel memory. You should not assume any unallocated memory region is in the user space.
if an application plans to use a memory region but doesn't tell the OS, the OS may allocate something else at that address (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyway), but it's easier to just call an OS function. It's also a good idea for the call to be "give me 10MB of heap" rather than "give me 10MB of heap starting at 0x12345678", as the sketch below illustrates.
If an application allocated memory simply by using it, it would typically never de-allocate at all. This can be problematic, as the OS still has to hold on to the unused pages (but then, the Java Virtual Machine does not de-allocate eagerly either, so hey).
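As a sketch of the "ask the OS, don't pick the address yourself" point (my own illustration; the hint address 0x10000000 is arbitrary): with mmap(), passing NULL lets the kernel choose the placement, while a non-MAP_FIXED hint is just that, a hint the kernel is free to ignore.

    /* Sketch: requesting memory with and without an address hint. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t size = 10 * 1024 * 1024;            /* "give me 10MB of heap" */

        void *anywhere = mmap(NULL, size, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        void *hinted = mmap((void *)0x10000000, size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        printf("no hint   -> %p\n", anywhere);     /* OS picks the address */
        printf("with hint -> %p\n", hinted);       /* may or may not match */
        return 0;
    }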
Different runs of a program result in different addresses for variables
This is called address space layout randomisation (ASLR) and is used, alongside proper permissions (stack space is not executable), to make buffer-overflow attacks much more difficult. You can still crash the app, but not execute arbitrary code.
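A quick way to observe this yourself (a small sketch of my own; run it a few times and compare the output):

    /* Sketch: with ASLR enabled, these addresses change on every run. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int on_stack = 0;
        char *on_heap = malloc(16);

        printf("stack variable: %p\n", (void *)&on_stack);
        printf("heap block    : %p\n", (void *)on_heap);
        printf("function main : %p\n", (void *)main);

        free(on_heap);
        return 0;
    }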
Some links on memory allocation (e.g. in heap).
Do you mean, what algorithm the allocator uses? The simplest algorithm is to always allocate at the earliest available position, link each memory block to the next, and store a flag saying whether it's a free block or a used block. More advanced algorithms allocate blocks whose size is a power of two or a multiple of some fixed size to prevent memory fragmentation (lots of small free blocks), or link the blocks into different structures to find a free block of sufficient size faster.
An even simpler approach is to never de-allocate: just point to the first (and only) free block and hold its size. If the remaining space is too small, throw it away and ask the OS for a new one.
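A minimal sketch of such a "never de-allocate" bump allocator (my own illustration; arena_init and arena_alloc are made-up names):

    /* Sketch: bump allocator that hands out pieces of one big block. */
    #include <stdio.h>
    #include <stdlib.h>

    static char  *arena;        /* next free byte in the big block */
    static size_t arena_left;   /* bytes still available           */

    static void arena_init(size_t size)
    {
        arena = malloc(size);
        arena_left = arena ? size : 0;
    }

    static void *arena_alloc(size_t n)
    {
        n = (n + 15) & ~(size_t)15;   /* keep 16-byte alignment */
        if (n > arena_left)
            return NULL;              /* too small: time to ask the OS again */
        void *p = arena;
        arena += n;
        arena_left -= n;
        return p;
    }

    int main(void)
    {
        arena_init(1 << 20);                          /* one 1 MiB arena */
        char *a = arena_alloc(100);
        char *b = arena_alloc(200);
        printf("a=%p b=%p (b sits right after a)\n", (void *)a, (void *)b);
        return 0;
    }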
There's nothing magical about memory allocators. All they do is to:
ask the OS for a large region and
partition it to smaller chunks
without
wasting too much space or
taking too long.
Anyways, the Wikipedia article about memory allocation is http://en.wikipedia.org/wiki/Memory_management .
One interesting algorithm is called "(binary) buddy blocks". It holds several pools of a power-of-two size and splits them recursively into smaller regions. Each region is then either fully allocated, fully free or split in two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.
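The arithmetic behind buddy blocks is pleasantly small. A sketch of my own (buddy_of is an illustrative helper): because block sizes and offsets within a pool are powers of two, a block's buddy is found by XORing its offset with its size, and two free buddies merge back into the block at the lower of the two offsets.

    /* Sketch: computing a block's buddy in a buddy allocator. */
    #include <stddef.h>
    #include <stdio.h>

    static size_t buddy_of(size_t offset, size_t block_size)
    {
        return offset ^ block_size;   /* flip the bit that this size occupies */
    }

    int main(void)
    {
        /* A 64-byte block at offset 128 pairs with the one at offset 192;
         * together they merge back into a 128-byte block at offset 128. */
        printf("buddy of offset 128 (size 64): %zu\n", buddy_of(128, 64));
        printf("buddy of offset 192 (size 64): %zu\n", buddy_of(192, 64));
        return 0;
    }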

Memory defragmentation software. How does it work? Does it work?

I was reading an article on memory fragmentation when I recalled that there are several examples of software that claim to defragment memory. I got curious, how does it work? Does it work at all?
EDIT:
xappymah gave a good argument against memory defragmentation, in that a process might be very surprised to learn that its memory layout has suddenly changed. But as I see it there's still the possibility of the OS providing some sort of API for global memory control. It does seem a bit unlikely, however, since a badly designed API of that kind could be put to malicious use. Does anyone know if there is an OS out there that supports something of the sort?
Real memory defragmentation at the process level is possible only in managed environments such as Java VMs, where the runtime has access to the objects allocated in memory and can manage (and move) them.
But if we are talking about unmanaged applications, then there is no way to control their memory with third-party tools, because every process (both the tool and the application) runs in its own address space and doesn't have access to another's, at least not without help from the OS.
However, even if you got access to another process's memory (by hacking your OS or otherwise) and started modifying it, I think the target application would be very "surprised".
Just imagine: you allocate a chunk of memory, get its starting address, and the next second that chunk is moved somewhere else because of "VeryCoolMemoryDefragmenter" :)
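To see why a managed environment makes moving objects feasible, here is a toy sketch of my own (not any real VM's code): if every access goes through a handle table owned by the allocator, live objects can be slid together and only the table needs patching, whereas raw pointers held by the application would simply dangle.

    /* Sketch: handle-based pool that can compact itself. */
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    #define POOL_SIZE 1024
    #define MAX_OBJS  16

    static char   pool[POOL_SIZE];
    static size_t used;
    static struct { size_t off, len; int live; } handles[MAX_OBJS];

    /* Hands out a handle (an index), never a raw pointer. */
    static int h_alloc(size_t len)
    {
        if (used + len > POOL_SIZE)
            return -1;
        for (int h = 0; h < MAX_OBJS; h++)
            if (!handles[h].live) {
                handles[h].off  = used;
                handles[h].len  = len;
                handles[h].live = 1;
                used += len;
                return h;
            }
        return -1;
    }

    static void h_free(int h) { handles[h].live = 0; }

    /* Slides live objects together; only the handle table needs updating. */
    static void compact(void)
    {
        size_t dst = 0;
        for (int h = 0; h < MAX_OBJS; h++)
            if (handles[h].live) {
                memmove(pool + dst, pool + handles[h].off, handles[h].len);
                handles[h].off = dst;
                dst += handles[h].len;
            }
        used = dst;
    }

    int main(void)
    {
        int a = h_alloc(100), b = h_alloc(200), c = h_alloc(50);
        h_free(b);                 /* leaves a 200-byte hole in the pool */
        compact();                 /* a and c slide together             */
        printf("a at offset %zu, c at offset %zu, used=%zu\n",
               handles[a].off, handles[c].off, used);
        return 0;
    }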
In my opinion RAM is a kind of flash drive: the chip doesn't get fragmented, because there are no spinning platters and heads recording and playing back information scattered in a random way across the surface, which is how hard-disk fragmentation happens. That's also why SSD drives are so fast, effective, reliable and maintenance-free: an SSD is essentially one big piece of memory and looks a lot like RAM in that respect.

Checking the amount of available RAM within a running program

A friend of mine was asked, during a job interview, to write a program that measures the amount of available RAM. The expected answer was using malloc() in a binary-search manner: allocating larger and larger portions of memory until getting a failure message, reducing the portion size, and summing the amount of allocated memory.
I believe that this method will measure the amount of virtual, not physical, memory. But I got curious about the matter.
Is there a way to tell the amount of available RAM from within the program, without using exec(dmesg |grep -i memory) ?
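For reference, here is a rough sketch (my own reconstruction) of the approach the interviewer apparently expected. As the answers below explain, it probes the process's virtual address space and is defeated by overcommit on Linux, so it does not actually measure RAM.

    /* Sketch: keep allocating chunks, halve the chunk size on failure,
     * sum what was granted.  Measures grantable virtual memory, not RAM. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t chunk = (size_t)1 << 30;   /* start with 1 GiB requests */
        size_t total = 0;

        while (chunk >= 4096) {
            void *p = malloc(chunk);
            if (p)
                total += chunk;           /* keep it and retry the same size */
            else
                chunk /= 2;               /* "reduce the portion size" */
        }
        printf("total granted: ~%zu MiB\n", total / (1024 * 1024));
        return 0;                         /* everything is reclaimed on exit */
    }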
You are correct: malloc() makes no distinction between physical or virtual memory. In fact, that's the whole point of virtual memory: to make such details irrelevant to programs.
You can find out but it is OS-specific. For example, Linux.
The only way to do this is to use some OS-specific functionality. Using malloc() is useless for a number of reasons:
it measures virtual memory
the OS may well have per-process cap on memory allocations
allocating much more memory than is physically available often degrades the platform's stability to the point where the "go back one" algorithm suggested in the question probably won't work
this is OS specific and you should collect such information from the OS services unless you want to make your own memory management layer
Using malloc() will only tell you how much memory can be allocated to a single process. There may be reasons why this is lower than the total amount of virtual memory. For instance, you might have an OS quota or a per-process 32-bit-limited address space.
(And, of course, virtual memory >= RAM)
Very OS-specific, but on Linux the information about system memory is in /proc/meminfo. You can probably also use the sysctl interface (http://www.linuxjournal.com/article/2365) to get this data from a C program.
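A small Linux-only sketch of my own showing two ways to ask the OS directly: sysinfo(2) for free RAM, and the MemAvailable line of /proc/meminfo, which is usually the more useful "how much can I use without swapping" estimate.

    /* Sketch: query available memory from the OS on Linux. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/sysinfo.h>

    int main(void)
    {
        struct sysinfo si;
        if (sysinfo(&si) == 0)
            printf("free RAM: %llu MiB\n",
                   (unsigned long long)si.freeram * si.mem_unit / (1024 * 1024));

        FILE *f = fopen("/proc/meminfo", "r");
        if (f) {
            char line[128];
            while (fgets(line, sizeof line, f))
                if (strncmp(line, "MemAvailable:", 13) == 0)
                    fputs(line, stdout);
            fclose(f);
        }
        return 0;
    }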
