Pinned memory in CUDA - memory

I read somewhere that pinned memory in CUDA is scarce source. What is upper bound on pinned memory? In windows, in linux?

Pinned memory is just physical RAM in your system that is set aside and not allowed to be paged out by the OS. So once pinned, that amount of memory becomes unavailable to other processes (effectively reducing the memory pool available to rest of the OS).
The maximum pinnable memory therefore is determined by what other processes (other apps, the OS itself) are competing for system memory. What processes are concurrently running in either Windows or Linux (e.g. whether they themselves are pinning memory) will determine how much memory is available for you to pin at that particular time.

Related

What If a processes don't fit in memory?

If processes don’t fit in memory, What moves them in and out of memory to run?
this question is based on Operating System Memory management theory.
I have checked about the purpose of memory management unit. Is this related to swapping?
The operating system will use a memory management technique called virtual memory.
This is when a computer compensates for shortages of physical memory by temporarily transferring pages (segments of memory) of data from RAM to backing store. RAM is much faster than secondary storage and when a computer needs to use secondary storage over primary the user will feel the computer running slower.
The operating systems virtual memory manager is responsible for managing this. It will use techniques such as placing pages that have not been referenced for in a while into secondary memory (you hard disk for example) and if a page in secondary storage is required it will move the page from secondary to primary memory.
Another point is that most modern apps will page themselves, such as when they are minimised for example, to reduce the amount of memory they're using for other applications running.

Does the operating system itself issue virtual memory addresses?

An operating system itself has resources it needs to access, like block I/O cache and process control blocks. Does it use virtual memory addresses or physical memory addresses?
I feel like it should be the former since it prevents the need to keep a large area of physical memory for a purpose, even when it is mostly empty. The mechanism of page tables/virtual memory would do a much better job at keeping those resources that the OS really needs.
So which is it?
10 randomly selected operating systems will do virtual memory management in 10 different ways. There's no answer that applies to all operating systems.
Some (e.g. MS-DOS) don't support or use virtual memory management for anything, some (e.g. Linux) just map all of physical memory into kernel space and don't bother using virtual memory management tricks for the kernel itself (it's almost as if the kernel is in physical memory even though it's technically both), and some may do any number of virtual memory tricks in kernel space.

After boot does Linux reclaims Tianocore boot loader memory

I am using Tianocore for booting Linux, I understand that Linux can avail Tianocore Runtime services (reboot, update_capsule etc.), it means that
some part of Tianocore code remains untouched by linux. Linux will never touch that memory.
My question, is it some part of Tianocore code (related to Runtime Services) or the whole of Tianocore remains untouched by Linux kernel even after boot ?
and, how does Linux kernel comes to know about memory areas that contain Tianocore image ?
There are many memory types that can be allocated by UEFI implementation (using AllocatePool or AllocatePages BootServices), and some of them will remain untouched by UEFI-aware OS, others will be freed. All regions of memory that shouldn't be freed will also be added to e820 memory map to prevent legacy OSes from corrupting them.
Normally, only a small portion of allocated memory is not freed after ExitBS event: runtime services code and data, ACPI tables and MMIO regions.

How os handle fragmentation in virtual address space

As far as I know , the paging system do eliminate external fragment in physical address space, but what about fragment in virtual address space?
In modern OSes the virtual address space is used per process (the kernel has it's own dedicated virtual range), which means that the demands are much lower compared to the whole OS. The virtual address space is usually large enough (2-3 GB per process on x86 and multiple TB (8 on Windows) on x64 machines), so that fragmentation is not such a big issue as for the OS-wide physical address space. Still the issue can arise, especially for long running and memory hungry applications on x86 or other 32 bit architectures. For this the OS provides mechanisms, for example in form of the heap code. An application usually reserves one or more memory ranges as heap(s) when it starts and allocates the required chunks of memory from there later (e.g. malloc). There are a varity of implementations that handle fragmentation of the heap in different ways. Windows provides a special low-fragmentation heap implementation that can be used, if desired. Everything else is usually up to the application or it's libraries.
Let me add a qualification to your statement. Paging systems nearly eliminate fragmentation in the physical address space when the kernel is pageable.
On some systems, the user mode page tables are themselves pageable. On others, they are are physical locations that are not pageable. Then you can get fragmentation.
Fragmentation in the virtual address space tends to occur in heap allocation. The challenge of heap managers is to manage the space while minimizing fragmentation.

When does a Windows process run out of memory?

Under Windows Server 2003, Enterprise Edition, SP2 (/3GB switch not enabled)
As I understand it, and I may be wrong, the maximum addressable memory for a process is 4GB.
Is that 2GB of private bytes and 2GB of virtual bytes?
Do you get "out of memory" errors when the private byte limit or virtual byte limit is reached?
It is correct that the maximum address space of a process is 4GB, in a sense. Half of the address space is, for each process, taken up by the operating system. This can be changed with the 3GB switch but it might cause system instability. So, we are left with 2GB of addressable memory for the process to use on its own. Well, not entirely. It turns out that a part of this space is taken up by other stuff such as DLLs an other common code. The actual memory available to you as a programmer is around 1.5GB - 1.7GB.
I'm not sure about how you can handle accidentally going above this limit but I know of games which crash in large multiplayer maps for this reason. Another thing to note is that a 32bit program cannot use more than the 2GB address space on a 64bit system unless they enable the /LARGEADDRESSAWARE:YES linker flag.
Mark Russinovich started a series of posts on this..
Pushing the Limits of Windows: Physical Memory
While 4GB is the licensed limit for 32-bit client SKUs, the effective limit is actually lower and dependent on the system's chipset and connected devices. The reason is that the physical address map includes not only RAM, but device memory as well, and x86 and x64 systems map all device memory below the 4GB address boundary to remain compatible with 32-bit operating systems that don't know how to handle addresses larger than 4GB. If a system has 4GB RAM and devices, like video, audio and network adapters, that implement windows into their device memory that sum to 500MB, 500MB of the 4GB of RAM will reside above the 4GB address boundary.
You can only access 2Gb of memory in total (without the 3Gb switch) on 32bit Windows platforms.
You could run multiple 32bit VMs on a 64bit OS so that each app has access to as much memory as possible if your machine has more than 4Gb.
A lot of people are just starting to hit these barriers, I guess it's easier if your app is in .net or Java as the VMs happily go up to 32Gb of memory on 64bit os.
On 32 bits, if there is enough physical memory and disk space for virtual memory, memory runs out around 3GB since the kernel reserves the address space above 0xC0000000 for itself. On a 64 bits kernel running a 64 bits application, the limit is at 8TB.
For more details, check out MSDN - Memory Limits for Windows Releases
Maximum addressable memory for a 32bit machine is 4GB, for a 64bit machine you can address loads more. (Although some 32bit machines have extension systems for accessing more, but I don't think this is worth bothering with or considering for use).
You get out of memory errors when the virtual limit is reached. On Windows Server 2003, task manager tells you the limit on the performance tab labelled 'Commit Charge Limit'.

Resources