Record of empty memory locations

I know that free space at different memory locations is tracked, for example with a linked list. But how does a compiler know that free space is available in memory?
How does the system keep a record of the free pool? Is there a data structure, such as a linked list, used to store the addresses of free memory?

Related

Committed vs Reserved Memory

According to "Windows Internals, Part 1" (7th Edition, Kindle version):
Pages in a process virtual address space are either free, reserved, committed, or shareable.
Focusing only on the reserved and committed pages, the first type is described in the same book:
Reserving memory means setting aside a range of contiguous virtual addresses for possible future use (such as an array) while consuming negligible system resources, and then committing portions of the reserved space as needed as the application runs. Or, if the size requirements are known in advance, a process can reserve and commit in the same function call.
Both reserving and committing will initially get you entries in the VADs (virtual address descriptors), but neither operation will touch the PTE (page table entry) structures. Reserving used to cost PTEs before Windows 8.1, but no longer does.
As described above, reserved means blocking a range of virtual addresses, NOT blocking physical memory or paging-file space at the OS level. The OS doesn't count this toward the commit limit, so when the time comes to actually use this memory, you might get a surprise. It's important to note that reserving happens from the perspective of the process address space; no physical resource is reserved - there's no stamping of "no vacancy" against RAM or page file(s).
The analogy with plots of land might be missing something: take reserved as an area of land surrounded by wooden poles, letting others know that the land is taken. But what about committed? It can't be land on which structures (e.g. houses) have already been built, since those would require PTEs, and there are none yet because we haven't accessed anything. It's only when touching committed data that the PTEs get built, which makes the pages available to the process.
The main problem is that committed memory - at least in its initial state - behaves very much like reserved memory. It's just an area blocked out within the VADs. Try to touch one of the addresses, and you'll get an access violation exception for a reserved address:
Attempting to access free or reserved memory results in an access violation exception because the page isn’t mapped to any storage that can resolve the reference
...and an initial page fault for a committed one (immediately followed by the required PTE entries being created).
Back to the land analogy: once houses are built, that patch of land is still committed. Yet this is a bit peculiar, since it was also committed when the original grass was there, before the first shovelful of earth was dug to start construction - at which point it resembled a reserved patch. Maybe it would be better to think of it as terrain eligible for construction, e.g. you hold a permit to build (though you might never build so much as a wall on that patch of land).
What would be the reasons for using one type of memory versus the other? There's at least one: the OS guarantees that there will be room for committed memory if it's ever actually used, but guarantees nothing for reserved memory beyond blocking out that range of the process' address space. The only downside of committed memory is that one or more paging files might need to grow so that the commit limit can accommodate the newly allocated block, ensuring that if the requester ever demands part or all of the data, the OS can provide access to it.
I can't really think how the land analogy can capture this detail of "guarantee". After all, the reserved patch also physically existed, covered by the same grass as a committed one in its pristine state.
The stack is another scenario where reserved and committed memory are used together:
When a thread is created, the memory manager automatically reserves a predetermined amount of virtual memory, which by default is 1 MB.[...] Although 1 MB is reserved, only the first page of the stack will be committed [...]
along with a guard page. When a thread’s stack grows large enough to touch the guard page, an exception occurs, causing an attempt to allocate another guard. Through this mechanism, a user stack doesn’t immediately consume all 1 MB of committed memory but instead grows with demand.
There is an answer here that deals with why one would want to use reserved memory as opposed to committed. It involves storing continuously expanding data - which is actually the stack model described above - and having specific absolute address ranges available when needed (although I'm not sure why one would want that within a process).
OK, what am I actually asking?
What would be a good analogy for the reserved/committed concept?
Any other reason, aside from those depicted above, that would mandate the use of reserved memory? Are there any interesting use cases where resorting to reserved memory is a smart move?
Your question hits upon the difference between logical memory translation and virtual memory translation. While CPU documentation likes to conflate these two concepts, they are different in practice.
If you look at logical memory translation, there are only two states for a page. Using your terminology, they are FREE and COMMITTED. A free page is one that has no mapping to a physical page frame; a COMMITTED page has such a mapping.
In a virtual memory system, the operating system has to maintain a copy of the address space in secondary storage. How this is done depends upon the operating system. Typically, a process will have mappings to several different files for secondary storage. The operating system divides the address space into units usually called SECTIONs.
For example, the code and read-only data could be stored virtually as one or more SECTIONs in the executable file. Code and static data in shared libraries could each be in a different section that is paged to the shared libraries. A mapping of a shared file into the process, whose memory can be accessed by multiple processes, forms another section. Most of the read/write data is likely to be in a page file in one or more sections. How the operating system tracks where it virtually stores each section of data is system dependent.
For Windows, that gives the definition of one of your terms: sharable. A sharable section is one where a range of addresses can be mapped into different processes, at different (or possibly the same) logical addresses.
Your last term is then RESERVED. If you look at the documentation for Windows' VirtualAlloc function, you can see that (among your options) you can RESERVE or COMMIT. If you reserve, you are creating a section of VIRTUAL MEMORY that has no mapping to physical memory.
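To make the distinction concrete, here is a minimal C sketch against the VirtualAlloc API: reserve a large range first, commit a small slice of it, and only then touch it. The sizes are arbitrary and error handling is kept minimal.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Reserve 1 GiB of virtual address space: no physical memory or
       page-file space is charged against the commit limit yet. */
    SIZE_T reserveSize = (SIZE_T)1 << 30;
    BYTE *base = VirtualAlloc(NULL, reserveSize, MEM_RESERVE, PAGE_NOACCESS);
    if (base == NULL) { fprintf(stderr, "reserve failed\n"); return 1; }

    /* Touching the range now would raise an access violation.
       Commit the first 64 KiB before using it. */
    if (VirtualAlloc(base, 64 * 1024, MEM_COMMIT, PAGE_READWRITE) == NULL) {
        fprintf(stderr, "commit failed\n");
        return 1;
    }

    base[0] = 42;  /* first touch: page fault, PTE built, page materialised */

    VirtualFree(base, 0, MEM_RELEASE);  /* release the whole reservation */
    return 0;
}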
This RESERVE/COMMIT model is Windows-specific (although other operating systems may do the same). The likely reason was to save disk space. When Windows NT was developed, 600MB drives the size of washing machines were still in use.
In these days of 64-bit address spaces, this system works well for (as you say) expanding data. In theory, an exception handler for a stack overrun can simply expand the stack. Reserving 4GB of memory takes no more resources than reserving a single page (which would not be practicable in a 32-bit system—see above). If you have 20 threads, this makes reserving stack space efficient.
What would be a good analogy for the reserved/committed concept?
One could say RESERVE is like buying an option to buy, and COMMIT is exercising that option.
Any other reason, aside from those depicted above, that would mandate the use of reserved memory? Are there any interesting use cases where resorting to reserved memory is a smart move?
IMHO, the most likely places to RESERVE without COMMITTING are for creating stacks and heaps, with the former being the most important.

Mapping and allocating

I am a little confused by the term mapping. For example, when we say we're mapping memory for a database, does it mean that we are assigning a specific amount of memory at some memory location to that database?
Also, is allocating memory a synonym for reserving memory?
Very often I encounter these two terms, and they aren't so clear to me.
If someone can clarify these two terms, I will be very thankful.
This might be a question better asked of the software community at Stack Overflow. However, I am a computer scientist.
I would say that terms aren't always used accurately and precisely.
In general, allocating memory is making memory available to a program for an active purpose, such as allocating buffers to hold a file or an in-memory structure right now.
Reserving memory is often used to mean the same thing. However, it is sometimes more passive, for example reserving memory in case there is a future requirement, or protecting against too much memory being allocated for a different purpose.
Often when the term 'mapping' is used, it is for a file. It may mean exactly the same as allocating, or it may mean more: mapping may use an underlying mechanism provided by the virtual memory management system, where part of virtual memory is 'mapped' to the file without actually reading the file into physical memory. The trick is that as the memory-mapped file is accessed, the block/page being accessed is read in 'invisibly' to the process when necessary. This uses a mechanism called demand paging. Its benefit is that a program can access the file as if it were all read into memory, but only the parts actually accessed are retrieved from the persistent storage (disk, flash, whatever), which can be a huge win if only small parts of the file are needed.
Further, it simplifies the program, which can be written as if the whole file is in memory. Instead of the application developer trying to keep track of which parts of the file have been loaded into memory, the operating system does that instead.
Even better, the Operating system can be asked to track which blocks/pages have their contents changed, and it can be asked to periodically write that back out to persistent storage. This can even further simplify the application program.
This is popular with some databases.
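As an illustration of the mechanism described above, here is a minimal POSIX (e.g. Linux) sketch using mmap, with a hypothetical file name; only the pages actually touched are read in from disk by demand paging.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);  /* hypothetical, non-empty file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file; nothing is read from disk yet. */
    const unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* First access to a page triggers a page fault; the kernel reads
       just that page in, invisibly to the process. */
    printf("first byte: %d\n", p[0]);

    munmap((void *)p, st.st_size);
    close(fd);
    return 0;
}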
Mapping basically means assigning, except we often want a one-to-one mapping in the case of functions. If you define the function of an object, physical or just logical, and define its relationships and how it changes under transformation, then you have mapped it.

Clear created Object memory

I'm using WMI SMBios to get some hardware information (check uSMBios.pas).
I don't want users to be able to see the serial numbers left in memory, so I'm trying to clear them.
When I call
SMBios:=TSMBios.Create;
//my code
SMBios.free;
the SMBios object's data is still in memory in many locations.
I tried this code in the Destroy event:
if Assigned(FRawSMBIOSData.SMBIOSTableData) then
begin
  ZeroMemory(FRawSMBIOSData.SMBIOSTableData, FRawSMBIOSData.Length);
  FreeMem(FRawSMBIOSData.SMBIOSTableData);
end;
It works great with the GetSystemFirmwareTable API code in SMBios, but with WMI it removes some of the memory, yet I can still find a few blocks.
I'm wondering why the used memory is not released after calling object.Free or FreeAndNil.
Any idea how to force the application to free it?
Memory is released; it is just not wiped. You may be confusing two concepts: memory being bound to some owner so it cannot be given to another one, and memory being cleansed of all its information.
Look, when you walk over fresh snow or over sand, you leave your footsteps behind you. Once you've moved away, the places you passed through are FREE for anyone else to occupy. But your footsteps remain there until someone overwrites them with their own.
Now, you could be paranoid and, after every step, turn back, take a brush and remove your fresh footstep. That is possible and might make sense, but it would be painfully slow.
Some objects deal with sensitive data: passwords, cipher keys, personal data in bulk processing, etc. For those objects it makes sense to be paranoid and brush out every trace. So those objects are written to wipe the memory they no longer need immediately after its last use, and to do it once again in the destructor.
But when you have just closed a form showing a message like "file saved successfully", there aren't any secrets worth painting over. And that is most of a program.
So now decide whether you really have sensitive data like passwords. If you do, your code should overwrite it with different data before freeing it, and you will have to learn how the data is kept for different types in Delphi, so that pieces of it don't get copied to other places in memory while you process them. But most probably you don't need the data actually destroyed; you only need to mark "this place is FREE for anyone to put their data over my garbage", and that is what freeing an object in Delphi actually does. If that is enough for you, don't bother wiping the data (which merely substitutes random garbage for sensitive garbage, but garbage all the same).
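For what it's worth, the same idea in C on Windows might look like the sketch below. SecureZeroMemory is used because a plain memset right before free may be optimised away by the compiler; the buffer size and its contents are made up for illustration.

#include <windows.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *secret = malloc(64);
    if (secret == NULL) return 1;
    strcpy(secret, "hypothetical-password");
    /* ... use the secret ... */

    /* Wipe before freeing: SecureZeroMemory is guaranteed not to be
       optimised away, unlike a plain memset followed by free. */
    SecureZeroMemory(secret, 64);
    free(secret);
    return 0;
}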
Now, a few words about the suggestions of LU RD and whosrdaddy. Yes, Delphi provides you the means to hook into the way the heap is managed and to explicitly wipe the data with garbage before marking the apartment free. However, this is only a partial solution for sensitive data:
99.9% of the time you would be clearing data that was not worth it. Strings, dynamic arrays, TList and other containers would become slow - and your program too.
Your app consists of procedures that have local variables. Many of those variables, like short strings, fixed-size arrays and GUIDs, are allocated on the stack rather than on the heap. Those suggestions would not clean them, only free them.
Your objects typically allocate memory on the Delphi heap, but they might also allocate it elsewhere: on the Windows heap, in some multithreading-aware pool, or whatever. That memory would not be wiped by modifying the default Delphi heap manager's behavior.
Overall it is the same idea: your procedure or your object knows which data is dangerous and where it is kept, so that object or procedure is responsible for the cleansing. Global Delphi-wide solutions would be both ineffective and unreliable.

memory management and segmentation faults in modern day systems (Linux)

In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My questions are:
If each process has its own address space, then it should be free to access any address within it. So, apart from permission-restricted sections like .data, .bss, .text, etc., one should be free to change the value at any address. But this usually gives a segmentation fault. Why?
To acquire dynamic memory, we need to call malloc. If the whole virtual space is available to a process, why can't it directly access it?
Different runs of a program result in different addresses for variables (both on the stack and the heap). Why is this so, when the environment for each run is the same? Does it affect the amount of addressable memory available for use? (Does it have something to do with address space randomization?)
Also, some links on memory allocation (e.g. in the heap) would help.
The information available in different places is very confusing, as it mixes old and modern systems, often without distinguishing between them. It would be helpful if someone could clarify these doubts with modern systems in mind, say Linux.
Thanks.
Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
Different memory regions serve different purposes:
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stacks, so the OS might as well forbid that. Moreover, the thread stack can be de-allocated when the thread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches the kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set) or above 3<<30 (top two bits set) is kernel memory. You should not assume any unallocated memory region is in the user space.
if an application thinks about using a memory region but doesn't tell the OS, the OS may allocate something else to that memory (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyways), but it's easier to just call an OS function. It's just a good idea that the function call is "give me 10MB of heap", not "give me 10MB of heap starting at 0x12345678"
If the application allocates memory just by using it, then it typically never de-allocates at all. This can be problematic, as the OS still has to keep the unused pages around (but the Java Virtual Machine doesn't de-allocate either, so hey).
Different runs of a program result in different addresses for variables
This is called address space layout randomisation (ASLR) and is used, alongside proper permissions (e.g. stack space not being executable), to make buffer overflow attacks much more difficult. You can still kill the app, but not execute arbitrary code.
Some links on memory allocation (e.g. in the heap).
Do you mean, what algorithm the allocator uses? The simplest algorithm is to always allocate at the first available position, linking each memory block to the next and storing a flag saying whether it is free or used. More advanced algorithms allocate blocks whose sizes are powers of two or multiples of some fixed size to limit memory fragmentation (lots of small free blocks), or link the blocks into different structures so a free block of sufficient size can be found faster.
An even simpler approach is to never de-allocate: just point at the first (and only) free block and hold its size. If the remaining space is too small, throw it away and ask the OS for a new one.
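A toy version of the first-fit free list just described might look like the following C sketch. It assumes a single fixed arena, skips coalescing, and handles alignment only crudely; real allocators are far more elaborate.

#include <stddef.h>
#include <stdint.h>

typedef struct Block {
    size_t size;          /* payload size in bytes */
    int    used;          /* non-zero once handed out */
    struct Block *next;   /* next block in address order */
} Block;

/* One fixed 1 MiB arena; a real allocator grows via sbrk/mmap/VirtualAlloc. */
static _Alignas(max_align_t) uint8_t arena[1 << 20];
static Block *head;

void *my_malloc(size_t n)
{
    if (head == NULL) {                    /* lazy one-time initialisation */
        head = (Block *)arena;
        head->size = sizeof(arena) - sizeof(Block);
        head->used = 0;
        head->next = NULL;
    }
    n = (n + 7) & ~(size_t)7;              /* round up for crude alignment */
    for (Block *b = head; b != NULL; b = b->next) {
        if (!b->used && b->size >= n + sizeof(Block)) {
            /* First fit: split the free block and keep its front part. */
            Block *rest = (Block *)((uint8_t *)(b + 1) + n);
            rest->size = b->size - n - sizeof(Block);
            rest->used = 0;
            rest->next = b->next;
            b->size = n;
            b->used = 1;
            b->next = rest;
            return b + 1;                  /* payload starts after the header */
        }
    }
    return NULL;                           /* no free block is large enough */
}

void my_free(void *p)
{
    if (p == NULL) return;
    Block *b = (Block *)p - 1;             /* step back to the hidden header */
    b->used = 0;
    /* A real allocator would coalesce adjacent free blocks here. */
}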
There's nothing magical about memory allocators. All they do is:
ask the OS for a large region, and
partition it into smaller chunks,
without
wasting too much space or
taking too long.
Anyways, the Wikipedia article about memory allocation is at http://en.wikipedia.org/wiki/Memory_management.
One interesting algorithm is called "(binary) buddy blocks". It keeps several pools of power-of-two sizes and splits them recursively into smaller regions. Each region is then either fully allocated, fully free, or split into two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.
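The arithmetic that makes buddy systems cheap is worth seeing: the buddy of a block of size 2^k at offset off within its pool sits at off XOR 2^k, so finding it needs no search at all. The offsets below are made-up values.

#include <stdio.h>

int main(void)
{
    unsigned k = 12;                     /* block size 2^12 = 4 KiB */
    unsigned off = 0x3000;               /* block offset inside the pool */
    unsigned buddy = off ^ (1u << k);    /* flip bit k to locate the buddy */
    printf("buddy of 0x%x is 0x%x\n", off, buddy);  /* prints 0x2000 */
    return 0;
}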

How does a program know how much memory to release?

I suspect the answer to my question is language specific, so I'd like to know about C and C++. When I call free() on a buffer or use delete[], how does the program know how much memory to free?
Where is the size of the buffer or of the dynamically allocated array stored and why isn't it available to the programmer as well?
Each implementation will be different, but typically the runtime allocates a bit more than asked for, and uses some hidden fields at the start of the block to remember the allocated size. The address returned to the caller is therefore offset a bit from the start of the memory claimed from the heap.
It isn't available to the caller because the true amount of memory claimed from the heap is an implementation detail, and will vary between compilers and platforms. As for knowing how much the caller asked for, rather than how much was allocated from the heap... well, the language designers assume that the programmer is capable of remembering this if needed.
The heap keeps track of all memory blocks, both allocated and free, specifically for that purpose. A typical (if naive) implementation allocates memory, uses several bytes at the beginning for bookkeeping, and returns the address past those bytes. On subsequent operations (free/realloc), it subtracts those few bytes to get back to the bookkeeping area.
Some heap implementations (say, Windows' GlobalAlloc()) let you query the block size given its starting address. But the standard C/C++ runtime heap offers no such service.
Note that malloc() sometimes over-allocates memory, so information about the allocated block size would be of limited utility. C++ new[]'ed arrays are a whole other matter - for those, knowing the exact array size is essential for array destruction to work properly. Still, there's no such thing in C++ as a dynamic_sizeof operator.
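As an illustration of the "hidden field just before the returned pointer" technique these answers describe, here is a hypothetical C sketch wrapping malloc. Real heap bookkeeping is more involved, but the pointer arithmetic is the same.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void *sized_alloc(size_t n)
{
    /* Over-allocate by one size_t and hide the size at the front. */
    size_t *raw = malloc(sizeof(size_t) + n);
    if (raw == NULL) return NULL;
    raw[0] = n;
    return raw + 1;                 /* the caller sees only the payload */
}

void sized_free(void *p)
{
    if (p == NULL) return;
    size_t *raw = (size_t *)p - 1;  /* step back to the hidden header */
    printf("freeing %zu bytes\n", raw[0]);
    free(raw);
}

int main(void)
{
    char *buf = sized_alloc(100);
    if (buf == NULL) return 1;
    strcpy(buf, "hello");
    sized_free(buf);                /* prints: freeing 100 bytes */
    return 0;
}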
The memory allocator that gave you that chunk of memory is responsible for all that maintenance data. Typically it's stored at the beginning of the chunk (right before the actual address you use) so it's easy to access on freeing.
Regarding your other question: why should your app know about it? It's not your concern. This decouples memory allocation management from the app, so you can use different allocators (for performance or debugging reasons).
It's stored internally in a location dependent on the language/compiler/OS.
Sometimes it is available (.Length in C# for example), though that may only refer to how much memory you're allowed to use, and not the object's total size.
Usually because the size to free is stored somewhere within the allocated buffer. A common technique is to have the size stored in memory just before the returned pointer.
Why isn't such information available to the programmer? I don't really know. I guess it's because an implementation may be able to provide memory allocation without actually needing to store the size, and such an implementation - if it exists - shouldn't be penalized because of the others.
It's not so much language specific. It's all done by the memory manager.
How it knows depends on how the memory manager manages memory. The general idea is that the memory manager allocates more memory than you ask for and stores extra data about the allocated blocks in those extra locations. Thus, when you release the memory, it uses the information stored there (found from the given pointer) and figures out how much actual memory to stop managing.
Don't confuse deallocation with destruction.
free() knows the size of the memory because of some internal magic ("implementation-defined"), e.g. the allocator could keep a list of all the allocated memory regions indexed by their corresponding pointers and just look up the pointer to know what to deallocate; or that information could be stored next to the allocated memory itself in some hidden block of data.
The array-delete expression delete[] arr; does not only deallocate memory; it also invokes all the destructors. For that purpose, it is not sufficient to know just the memory size; we also need to know the number of elements. For that, new T[N] actually allocates more than sizeof(T) * N bytes of memory, so the array-deleter knows how many destructors to call. All that memory is properly deallocated by the corresponding delete[] operator.
