I use Delphi 2007, so there is a 32-bit limit of available memory.
Using the IMAGE_FILE_LARGE_ADDRESS_AWARE PE flag, there should be a 3 GB limit instead of 2 GB:
{$SetPEFlags IMAGE_FILE_LARGE_ADDRESS_AWARE} // Allows usage of more than 2GB memory
This is the method I use to get the current memory usage of the process:
function MemoryUsed: Int64;
var
PMC: _PROCESS_MEMORY_COUNTERS_EX;
begin
Win32Check(GetProcessMemoryInfo(GetCurrentProcess, #PMC, SizeOf(PMC)));
Result := PMC.PrivateBytes;
end;
Now I want a way to get the total amount of available memory for the process. It should be around 3 GB. But I don't want to hardcode it, as in the future we will move to new Delphi and 64-bit.
What Win32 API function should I use?
Available memory - Computers available memory - Maybe 8 GB RAM is installed. If more is required the OS start to swap memory to disk.
Process available memory - A limitation in executable and Windows. Now most Windows is 64-bit so that is not a problem. But if executable is compiled as 32-bit with IMAGE_FILE_LARGE_ADDRESS_AWARE, the limit should be 3 GB, right? When executable is 64-bit, it will be much larger, maybe 64 GB (but then swapping may happen if installed RAM is less...).
So my question is, how can I get the process's available memory?
There are a couple of obvious things you can do. Call GetSystemInfo and subtract lpMinimumApplicationAddress from lpMaximumApplicationAddress to find the amount of address space available to your process.
The amount of physical memory available to you is much harder to obtain, and is not a fixed quantity. You are competing with all the other processes for that, and so this is a very fluid and dynamic concept. You can find out how much physical memory is available on the system by calling GlobalMemoryStatusEx. That returns other information too but it's very easy to misinterpret it. In fact this API will also tell you how much virtual memory is available to your process which would give you the same information as in the first paragraph.
Perhaps what you want is the minimum of the total physical and total virtual memory. But I would not like to say. I've seen many examples of code that needlessly limits its ability to perform by taking bad decisions based on misinterpreted memory statistics.
Related
I want to print the memory location (address) of a variable with:
let x = 1;
println!("{:p}", &x);
This prints the hex value 0x7fff51ef6380 which in decimal is 140734568031104.
My computer has 16GB of RAM, so why this huge number? Does the x64 architecture use a big interval sequence instead of just simple 1 increment for accessing memory location?
In x86, usually the first location starts at 0, then 1, 2, etc. so the highest number you can have is around 4 billion, so the address number was always equals or less than 4 billion.
Why is this not the case with x64?
What you see here is an effect of virtual memory. Memory management is hard and it becomes even harder when the operating system and tens of hundreds of processes have to share the memory. In order to handle this huge complexity, the concept of virtual memory was used. I'll just briefly explain the basics here; the topic is far more complex and you should read about it somewhere else, too.
On most modern computers, each process thinks that it owns (almost) the complete memory space. But processes never deal with physical addresses, but with virtual ones. These virtual addresses are mapped to physical ones each time the process actually reads from memory. This translation of addresses is done by the so called MMU (memory management unit). The rules for how to map the addresses are setup by the operating system.
When you boot your PC, the operating system creates an initial mapping. Every time you start a process, the operating system adds a few slices of physical memory to the process and modifies the mapping appropriately. That way, the process has memory to play with.
On x86_64, the address space is 64 bit wide, so each process thinks it owns all of those 2^64 addresses. This is not true, of course:
There isn't a single PC on the world with that much memory. (In fact, most CPUs today can merely use 280 TB of RAM, since they internally can only use 48bit for addressing physical memory. And even these 280TB are enough for now, apparently.)
Even if you had that much memory, there are other processes which use part of that memory, too.
So what happens when you try to read an address which isn't mapped (which in 64bit land, are the vast majority of the addresses)? The MMU triggers a page fault. This makes the CPU notify the operating system to handle this.
What I mean is that in x86, usually first location starts at 0, then 1, 2, etc. so the highest number you can have is around 4 billion.
That is true, but it is also true if your x86 system has less than 4GB of RAM. Virtual memory exists for quite some time already.
So that's a short summary of why you see such big addresses. Again, please note that I glossed over many details here.
The pointers your program works with are in virtual address space. x86-64 uses 64-bit pointers. This was one of the major goals of AMD64, along with adding more integer and XMM registers. You are correct that i386 only has 32-bit pointers which only cover 4GB of address space in each process.
0x7fff51ef6380 looks like a stack pointer, which I guess makes sense for that code.
Linux on x86-64 (for example) puts the stack near the top of the lower canonical address range: current x86-64 hardware only implements 48-bit virtual addresses and this is the mechanism to prevent software from depending on it. This allows the address space to be extended in the future without breaking software.
The amount of phyiscal RAM in your system has nothing to do with this. You'd see (approximately) the same number on an x86-64 system with 128MB of RAM, +/- stack address space layout randomization (ASLR).
I have a 32-bit Delphi application running with /LARGEADDRESSAWARE flag on. This allows to allocate up to 4GB on a 64-bit system.
Am using threads (in a pool) to process files where each task loads a file in memory. When multiple threads are running (multiple files being loaded), at some point the EOutOfMemory hits me.
What would be the proper way to get the available address space so I can check if I have enough memory before processing the next file?
Something like:
if TotalMemoryUsed {from GetMemoryManagerState} + FileSize <
"AvailableUpToMaxAddressSpace" then NoOutOfMemory
I've tried using
TMemoryStatusEx.ullAvailVirtual for AvailableUpToMaxAddressSpace
but the results are not correct (sometimes 0, sometimes > than I actually have).
I don't think that you can reasonably and robustly expect to be able to predict ahead of time whether or not memory allocations will fail. At the very least you would probably need to write your own memory allocator that was dedicated to serving your application, and have a very strong understanding of the heap allocation requirements of your process.
Realistically the tractable way forward for you is to break free from the shackles of 32 bit address space. That is your fundamental problem. The way to escape from 32 bit address space is to compile for 64 bit. That requires XE2 or later.
You may need to continue supporting 32 bit versions of your application because you have users that are still on 32 bit systems. The modern versions of Delphi have 32 bit and 64 bit compilers and it is quite simple to write code that will compile and behave correctly under both scenarios.
For your 32 bit versions you are less likely to run into memory problems anyway because 32 bit systems tend to run on older hardware with fewer processors. In turn this means less demand on memory space because your thread pool tends to be smaller.
If you encounter machines with large enough processor counts to cause out of memory problems then one very simple and pragmatic approach is to give the user a mechanism to limit the number of threads used by your application's thread pool.
Since there are more processes running on the target system even if the necessary infrastructure would be available, if would be no use.
There is no guarantee that another process does not allocate the memory after you have checked its availability and before you actually allocate it. The right thing to do is writing code that will fail gracefully and catch the EOutOfMemory exception when it appears. Use it as a sign to stop creating more threads until some of them is already terminated.
Delphi is 32bit, so you can't allocate memory addresses larger than that.
Take a look at this:
What is a safe Maximum Stack Size or How to measure use of stack?
I'm developing a logger/sniffer using Delphi. During operation I get hugh amounts of data, that can accumulate during stress operations to around 3 GB of data.
On certain computers when we get to those levels the application stops functioning and sometimes throws exceptions.
Currently I'm using GetMem function to allocate the pointer to each message.
Is there a better way to allocate the memory so I could minimize the chances for failure? Keep in mind that I can't limit the size to a hard limit.
What do you think about using HeapAlloc, VirtualAlloc or maybe even mapped files? Which would be better?
Thank you.
Your fundamental problem is the hard address space limit of 4GB for 32 bit processes. Since you are hitting problems at 3GB I can only presume that you are using /LARGEADDRESSAWARE running on 64 bit Windows or 32 bit Windows with the /3GB boot switch.
I think you have a few options, including but not limited to the following:
Use less memory. Perhaps you can process in smaller chunks or push some of the memory to disk.
Use 64 bit Delphi (just released) or FreePascal. This relieves you of the address space constraint but constrains you to 64 bit versions of Windows.
Use memory mapped files. On a machine with a lot of memory this is a way of getting access to the OS memory cache. Memory mapped files are not for the faint hearted.
I can't advise definitively on a solution since I don't know your architecture but in my experience, reducing your memory footprint is often the best solution.
Using a different allocator is likely to make little difference. Yes it is true that there are low-fragmentation allocators but they surely won't really solve your problem. All they could do would be make it slightly less likely to arise.
I have seen recently a strange behavior in my program. After creating large amounts of objects (500MB of RAM) then releasing them, the program's memory footprint does not return to its original size. It still shows a footprint of 160MB (Private working set).
Normal behavior?
Borland's memory manager does not behave like this, so if possible please confirm (or infirm) this is a normal behavior for FastMM: If you have a handy program in which you create a rather complex MDI child (containing several controls/objects), can you create in a loop 250 instances of that MDI child in memory (at the same time) then release them all and check the memory footprint. Please make sure that you consume at least 200-300MB or RAM with those MDI childs.
Especially those that still using Delphi 7 can see the difference by temporary disabling FastMM.
Thanks
If anybody is interested, especially if you want some proof this is not a memory leak (I hope it is not a mem leak in my code - this is also one of the points of this post: to check if it is my fault), here are the original discussions:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory
Dear Altar, I'm dazzled at how off the point you are in your guesses and how you don't listen to what people told you many times before.
Let's set some things straight. Memory management 101. Please read thoroughly.
When you allocate memory in Delphi, there are two memory managers involved.
System memory manager
First one is a system memory manager. This one is built into Windows and it gives memory in 4kb sized pages.
But it doesn't always give you memory in RAM (or physical memory). Your data can be kept on the hard drive, and read back every time you need to access it. This is awfully slow.
In other words, imagine you have 512Mb of physical memory. You run two programs, each requesting 1Gb of memory. What does OS do?
It grants both requests. Both apps get 1Gb of memory each. Both think all the memory is "in memory". But in fact, only 512Mb can be kept in RAM. The rest is stored in page file, although your app does not know that. It just works slow.
Working set size
Now, what is a "working set size" you are measuring?
It's the part of the allocated memory that is kept in RAM.
If you have an application which allocates 1Gb of memory, and you only have 512 Mb of RAM, then it's working set size will be 512Mb. Although it "uses" 1Gb of memory!
When you run another application which needs memory, OS will automatically free some RAM by moving rarely used blocks of "memory" to the hard drive.
Your virtual memory allocation will stay the same, but more pages will be on the hard drive and less in RAM. Working set size will decrease.
From this, you should have understood by this point, that it's pointless to try and minimize the working set size. You're achieving nothing. You're not freeing memory in any sense. You're just offloading the data to the hard drive.
But the system will do that automatically when it needs to. And there's no point making room in RAM until it's needed. You're just slowing down your application, that's all.
TLDR: "Working set size" is not "how much memory application uses". It's "how much is ready right now". Don't try to minimize it, you're just making things worse.
Delphi memory manager
OS gives you virtual memory in pages of 4Kb. But often you need it in much smaller chunks. For instance, 4 bytes for your integer, or 32 bytes for some structure. The solution?
Application memory manager, such as FastMM or BorlandMM or others.
It's job is to allocate memory in pages from the operating system, then give you small chunks of those pages when you need it.
In other words, when you ask for 14 bytes of memory, this is what happens:
You ask FastMM for 14 bytes of memory.
FastMM asks OS for 1 page of memory (4096 bytes).
OS grants one page of memory, backing it up with RAM (it's stored in actual RAM).
FastMM saves that page, cuts 14 bytes of it and gives to you.
When you ask for another 14 bytes, FastMM just cuts another 14 bytes from the same page.
What happens when you release memory? The same thing backwards:
You release 14 bytes to FastMM. Nothing happens.
You release another 14 bytes. FastMM sees that the 4096 byte page it allocated is now completely unused.
Therefore it releases the page, returning it to the system.
It's worth noting that FastMM cannot release just 14 bytes to the system. It has to release memory in pages. Until the whole page is free, FastMM cannot do a thing. Nobody can.
So, why is my working set size so big, even though I released everything?
First, your working set size is not what you should be measuring. Virtual memory consumption is. But if you have big working set size, your virtual memory consumption will be high too.
What's the problem? You should be able to figure out by this point.
Let's say you allocate 1kb, then 3kb of memory. How much virtual memory have you allocated? 4kb, 1 page.
Now you release 3Kb. How much virtual memory do you use now? 1Kb? No, it's still 1 page. You cannot allocate less than 1 page from the system. You're still using 4096 bytes of virtual memory.
Imagine if you do that 1000 times. 1kb, 3kb, 1kb, 3kb, 1kb, 3kb and so on. You allocate 1000 * 4kb = 4 mb like that, and then you release all the 3kb parts. How much virtual memory do you use now?
Still 4 mb. Because you allocated 1000 pages at first. Of every page you took 1kb and 3kb chunks. Even if you release 3kb chunks, 1kb chunks will continue to keep every single page you allocated in memory. And every page takes 4kb of virtual memory.
Memory manager cannot magically "move" all of your 1kb chunks together. This is impossible, because their virtual addresses can be referenced from somewhere in code. It's not a trait of FastMM.
But why with BorlandMM everything works better?
Coincidence. Maybe it just so happens that BorlandMM gives you memory in a slightly different way than FastMM does. Next thing you know, you change something in your app and BorlandMM acts just like FastMM did. It's impossible for a memory manager to completely prevent this effect, called memory fragmentation.
So what do I do?
Short answer is, not much until this bothers you.
You see, with modern operating systems, you're not really eating anyone's RAM. Per above, OS will automatically swap your pages out when it needs RAM for other applications. This should not be a concern.
And the "excessive" memory isn't lost. Although pages are allocated, 3kb of each is marked as "free". Next time your app needs memory, memory manager will use that space.
But if you really want to help it, you should reorganize your allocations so that the ones you're planning on keeping are done first, and the ones you will soon release are all allocated after that.
Like this: 1kb, 1kb, 1kb, ..., 3kb, 3kb, 3kb...
If you now release all the 3kb chunks, your virtual memory consumption will drop significantly.
This is not always possible. If it's impossible, then just do nothing. It's more or less alright like it is.
And P.S.
You shouldn't be allocating 500 forms in the first place. This is clearly not a way to go. Fix this, and you won't even have a need to think about memory allocation and releasing.
I hope this clears things up, because four posts on the same topic, frankly, is a bit too much.
IIRC, the Delphi memory manager does not immediately return free'd memory to the OS.
Memory is allocated in chunks of small, medium and large sizes, called blocks.
These blocks are kept for a while after their contents have been disposed to have them readyly available when another allocation is requested afterwards.
This limits the amount of system calls required for succesive allocation of multiple objects, and helps avoiding heap fragmentation.
Infirming: Delphi 2007, default memory manager (should be FastMM variation). Several tests on heavy objects:
Initial memory 2Mb, peak memory 30Mb, final memory 4Mb.
Initial memory 2Mb, peak memory 1Gb, final memory 5.5Mb.
What are the heapmanager stats (GetHeapStatus) on the point that 160MB is still allocated?
SOLVED
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly) I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program memory utilization returned to its original value.
Problem solved. Special thanks to Barry Kelly, the only person that pointed to the real "problem".
A friend of mine was asked, during a job interview, to write a program that measures the amount of available RAM. The expected answer was using malloc() in a binary-search manner: allocating larger and larger portions of memory until getting a failure message, reducing the portion size, and summing the amount of allocated memory.
I believe that this method will measure the amount of virtual, not physical, memory. But I got curious about the matter.
Is there a way to tell the amount of available RAM from within the program, without using exec(dmesg |grep -i memory) ?
You are correct: malloc() makes no distinction between physical or virtual memory. In fact, that's the whole point of virtual memory: to make such details irrelevant to programs.
You can find out but it is OS-specific. For example, Linux.
The only way to do this is to use some OS-specific functionality. Using malloc() is useless for a number of reasons:
it measures virtual memory
the OS may well have per-process cap on memory allocations
allocating much more memory than is physically available often degrades the platforms stability to the point where "go back one" algorithm suggested in the question probably won't work
this is OS specific and you should collect such information from the OS services unless you want to make your own memory management layer
Using malloc() will only tell you how much memory can be allocated to a single process. There may be reasons why this is lower than the total amount of virtual memory. For instance, you might have OS quota or a per-process 32-bit-limited address space.
(And, of course, virtual memory >= RAM)
Very OS specific but for Linux the information about system memory is in /proc/meminfo. You can also probably use the sysctl interface (http://www.linuxjournal.com/article/2365) to get this data in a C program.