Avoiding large memory commit size in a Windows process - Delphi

I am working on a 32-bit app (a VC++ main exe and a couple of Delphi DLLs) that slowly leaked memory (over weeks or months). While I fixed the leaks, one problem still remains: although the working set and active private working set stay relatively low (about 50 MB), the commit size keeps creeping up to about 1.7 GB.
I do realize that Windows (especially Windows 10) can be pretty aggressive about committing memory in a process, but still, I see no obvious reason for this to happen.
I patched VirtualAlloc / VirtualAllocEx / VirtualAlloc2 / etc. using Detours to see when the VC++ and Delphi memory managers would be forced to request more memory from the OS, but after a few initial requests on startup, I see no calls to those functions while the commit size keeps slowly growing.
What else can be responsible for the commit size growth? Any I/O operations? Memory mapped files?
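For reference, a minimal sketch (Delphi) of how the commit charge can be sampled from inside the process. The record below mirrors PROCESS_MEMORY_COUNTERS_EX (unit and type names can differ slightly between Delphi versions); PrivateUsage should correspond to the "Commit size" column in Task Manager:

    uses
      Windows, PsAPI, SysUtils;

    type
      // Layout of PROCESS_MEMORY_COUNTERS_EX; the Ex variant adds PrivateUsage.
      TProcessMemoryCountersEx = record
        cb: DWORD;
        PageFaultCount: DWORD;
        PeakWorkingSetSize: SIZE_T;
        WorkingSetSize: SIZE_T;
        QuotaPeakPagedPoolUsage: SIZE_T;
        QuotaPagedPoolUsage: SIZE_T;
        QuotaPeakNonPagedPoolUsage: SIZE_T;
        QuotaNonPagedPoolUsage: SIZE_T;
        PagefileUsage: SIZE_T;
        PeakPagefileUsage: SIZE_T;
        PrivateUsage: SIZE_T;
      end;

    procedure LogCommitSize;
    var
      Counters: TProcessMemoryCountersEx;
    begin
      FillChar(Counters, SizeOf(Counters), 0);
      Counters.cb := SizeOf(Counters);
      // Pass the extended structure through the documented PsAPI entry point.
      if GetProcessMemoryInfo(GetCurrentProcess,
           PProcessMemoryCounters(@Counters), SizeOf(Counters)) then
        OutputDebugString(PChar(Format('Working set: %d KB  Commit (private): %d KB',
          [Counters.WorkingSetSize div 1024, Counters.PrivateUsage div 1024])));
    end;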

Related

Strategy or tools to find "non-leak" memory usage problems in Delphi?

An old application started to consume a lot of memory after a server update. Memory usage seems to rise without limit until the program hangs.
According to FastMM4 and EurekaLog, there's no memory leak (except 28 bytes), so I assume all memory is freed when the application is shut down.
Are there any tools or strategies suitable for tracking this kind of memory problem?
Since September 2012, there is a very simple and convenient way to find this type of "run-time only" memory leak.
FastMM4 4.991 introduced a new method, LogMemoryManagerStateToFile:
Added the LogMemoryManagerStateToFile call. This call logs a summary of
the memory manager state to file: The total allocated memory, overhead,
efficiency, and a breakdown of allocated memory by class and string type.
This call may be useful to catch objects that do not necessarily leak, but
do linger longer than they should.
To discover the leak at run time, you only need these steps:
add a call to LogMemoryManagerStateToFile('memory.log', '') in a place where it will be called at intervals (see the sketch after these steps)
run the application
open the log file with a tail program (for example BareTail), which will auto-refresh when the file content changes
watch the first lines of the file; they will contain the memory allocations which occupy the highest amount of memory
if you see a class or memory type whose instance count keeps growing, this can be the reason for your leak
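For example, a minimal sketch of the first step, assuming a VCL application with a TTimer (the name MemoryLogTimer is illustrative) set to fire every minute or so:

    // FastMM4 should be the first unit in the project's uses clause so that
    // LogMemoryManagerStateToFile sees all allocations made by the application.
    procedure TMainForm.MemoryLogTimerTimer(Sender: TObject);
    begin
      // Writes a summary of live allocations (by class and string type) to the
      // log file; the second parameter is free-form additional detail.
      LogMemoryManagerStateToFile('memory.log', '');
    end;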
1. The growing memory consumption is an application issue. It is not a bug that FastMM4 or EurekaLog can discover; from their point of view, the application is simply using the memory correctly.
2. Using AQTime, MemProof (hard to find; D7 is the last supported version?), SleuthQA (similar to MemProof) or similar memory profilers, you can track the memory usage from outside the application in real time.
3. Using FastMM4's GetMemoryManagerState / GetMemoryManagerUsageSummary, you can track memory usage from inside the application. Output this information into a trace file and analyze it after the run. Or write a simple wrapper function around one of these routines that returns the current memory usage (see the sketch below), and call it from the IDE debugger's Evaluate/Modify, add it to the Watches window, or pass it to OutputDebugString to see the current memory usage.
Note: if the memory is eaten by some DLL, you may not see its usage using (3). Use (2).
4. Analyzing the memory usage and the tasks performed by the application, you may discover what leads to the raised memory usage.
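A minimal sketch of the wrapper mentioned in (3), assuming FastMM4 is installed as the memory manager (names are illustrative):

    uses FastMM4, Windows, SysUtils;

    // Returns a one-line summary of FastMM4's view of the current allocation.
    function CurrentMemoryUsage: string;
    var
      Summary: TMemoryManagerUsageSummary;
    begin
      GetMemoryManagerUsageSummary(Summary);
      Result := Format('Allocated: %d KB  Overhead: %d KB  Efficiency: %.1f%%',
        [Summary.AllocatedBytes div 1024, Summary.OverheadBytes div 1024,
         Summary.EfficiencyPercentage]);
    end;

    // Call this from a timer, evaluate CurrentMemoryUsage in the debugger,
    // or watch the output in the IDE's event log / DebugView.
    procedure TraceMemoryUsage;
    begin
      OutputDebugString(PChar(CurrentMemoryUsage));
    end;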
AQTime (a commercial tool which is quite expensive) can report your memory usage, down to the line of source code that allocated each object. In the case of very large memory usage scenarios, you might want the AQTime functionality that can show the number of objects and the size (total plus individual instance size) for each object. AQTime worked great for me, starting with Delphi 7, and all later versions, including your version (2006) and the latest versions (XE and XE2).
As the program's memory usage grows, AQTime can be used to grab "snapshots" of the runtime heap, which you can use to understand the memory usage of your application: what is being created, and how many of each object exist. Even when no leaks exist, understanding the runtime behaviour of your application in terms of the objects it creates and manages is very important, and AQTime is the most powerful tool I know of for Delphi users.
If you are willing to upgrade to Delphi XE/XE2, you might already have an included light version of AQTime; if so, check it out. If not, I recommend you try their demo. I am unaware of any free or open source alternatives that provide the same functionality.
Lesser functionality could be cobbled together manually by writing lots of trace messages, or by using FastMM's full debug mode. If you could write a complete dump of your memory usage into a very large file, you might be able to write some tools to parse it and produce a summary. The problem I have with FastMM in this case is that you will be drowned in detail, without the ability to extract exactly the summary information that helps you understand your situation. So you could try to write your own tool to summarize the memory usage. In one application that used a series of components which I knew would use a lot of memory, I wrote a dialog box into my application that showed the current memory usage by these large memory-blob-of-data objects.
Have you ever thought about the leak that the IDE itself is causing... it is so huge!!!
In my case (2GB of RAM) I do the following...
1. Open the IDE
2. Leave it minimized for near six hours
3. See how Physical memory is getting used
The result:
While the IDE is open (remember, I do the test with it minimized) it keeps taking more and more RAM... until there is no more free RAM.
It takes all 2GB of RAM plus all the pagefile space on the hard disk (I have it configured to a max of 4GB).
In less than six hours (doing nothing in the IDE) it tries to use more than 6GB.
That is called a memory leak caused by the IDE... I do not type a single letter in the IDE, do not compile anything, do not even open any project... just open the IDE and minimize it... leave the computer without doing anything on it for about six hours, and the IDE is consuming 6GB of memory.
Of course, after that, the IDE starts showing annoying SystemOutOfMemory messages... and I must kill it... then all those 6GB are freed!!!
When the hell will this get fixed?
Please note I have all patches applied; I also tested without applying each patch/hotfix, etc.
The best I got was disabling some options under Tools, like the one that underlines bad code, etc... so why the hell does that option have any influence... I am not typing anything in the IDE (during the tests)... and if I have it disabled, the memory leak gets reduced a lot...
Of course, if I use the IDE (write code in an open project) without even compiling/running it... things get much worse... a memory leak of up to 6GB can be reached in less than an hour, sometimes after 15 minutes of copy/pasting source code.
It seems there will not be a solution any time soon!!!
So I have the following workaround that works perfectly:
- Close the IDE and reopen it every 15 minutes or less
Ugly solution, I know... but it works!!!

How to use AQTime's memory allocation profiler in a program that uses a large amount of memory?

I'm finding AQTime hard to use because it interferes with the original program too much. If I have a program that uses, for example, 300MB of RAM, I can use AQTime's allocation profiler without a problem and find out where most of the memory is being used. However, I notice that running under AQTime, the original program uses more like 1GB while it's being profiled.
Right now I'm trying to reduce memory usage in a program which is using 1.4GB of memory. If I run it under AQTime, then the original program uses all of the 2GB address space and crashes. I can of course invent a smaller set of test data and estimate how the memory usage will scale with the full data set - but the reason I'm using a profiler in the first place is to try to avoid this sort of guesswork.
I already have AQTime set to 'Collect stack information - None' and all the check boxes to do with checking memory integrity are switched off, and I've tried restricting the area being profiled to just a few classes but this doesn't seem to improve anything. Is there a way to use AQTime that produces a smaller overhead? Or failing that, what other approaches are there to get a good idea of the memory being used?
The app is written in Delphi 2010 and I'm using AQTime 6.
NB: On top of the increased memory usage, running under AQTime slows the app down an awful lot, making the whole exercise not just impossible but impractical too :-P
AFAIK the allocation profiler will track memory block allocations regardless of profiling areas; profiling areas are used to track class instantiation. Of course, memory-profiling an application that allocates a large amount of memory is an issue. You may try the LARGE_ADDRESS_AWARE flag and the /3GB boot switch, or use a 64-bit system (as long as you have at least 4GB of memory or more). You can also take a snapshot of the application state before it crashes, to see where the memory is allocated. Profiling takes time anyway; you may have to let it run for a while.
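If you go the LARGE_ADDRESS_AWARE route, a minimal sketch of the change in the project (.dpr) file (project and unit names are placeholders); combined with the /3GB boot switch on 32-bit Windows, or run on 64-bit Windows, this lifts the 2GB address-space limit for a 32-bit process:

    program ProfiledApp; // placeholder project name

    uses
      Windows,  // declares IMAGE_FILE_LARGE_ADDRESS_AWARE
      Forms,
      MainUnit in 'MainUnit.pas' {MainForm};

    // Mark the 32-bit EXE as able to use addresses above 2GB.
    {$SetPEFlags IMAGE_FILE_LARGE_ADDRESS_AWARE}

    begin
      Application.Initialize;
      Application.CreateForm(TMainForm, MainForm);
      Application.Run;
    end.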

How much memory your program takes? (FastMM vs Borland MM)

I have recently seen strange behavior in my program. After creating a large number of objects (500MB of RAM) and then releasing them, the program's memory footprint does not return to its original size. It still shows a footprint of 160MB (private working set).
Normal behavior?
Borland's memory manager does not behave like this, so if possible please confirm (or infirm) that this is normal behavior for FastMM: if you have a handy program in which you create a rather complex MDI child (containing several controls/objects), can you create 250 instances of that MDI child in memory in a loop (all at the same time), then release them all and check the memory footprint? Please make sure that you consume at least 200-300MB of RAM with those MDI children.
Especially those still using Delphi 7 can see the difference by temporarily disabling FastMM.
Thanks
If anybody is interested, especially if you want some proof this is not a memory leak (I hope it is not a mem leak in my code - this is also one of the points of this post: to check if it is my fault), here are the original discussions:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory
Dear Altar, I'm dazzled at how off the point you are in your guesses and how you don't listen to what people told you many times before.
Let's set some things straight. Memory management 101. Please read thoroughly.
When you allocate memory in Delphi, there are two memory managers involved.
System memory manager
First one is a system memory manager. This one is built into Windows and it gives memory in 4kb sized pages.
But it doesn't always give you memory in RAM (or physical memory). Your data can be kept on the hard drive, and read back every time you need to access it. This is awfully slow.
In other words, imagine you have 512Mb of physical memory. You run two programs, each requesting 1Gb of memory. What does OS do?
It grants both requests. Both apps get 1Gb of memory each. Both think all the memory is "in memory". But in fact, only 512Mb can be kept in RAM. The rest is stored in page file, although your app does not know that. It just works slow.
Working set size
Now, what is a "working set size" you are measuring?
It's the part of the allocated memory that is kept in RAM.
If you have an application which allocates 1Gb of memory, and you only have 512Mb of RAM, then its working set size will be 512Mb, although it "uses" 1Gb of memory!
When you run another application which needs memory, OS will automatically free some RAM by moving rarely used blocks of "memory" to the hard drive.
Your virtual memory allocation will stay the same, but more pages will be on the hard drive and less in RAM. Working set size will decrease.
From this, you should understand by now that it's pointless to try to minimize the working set size. You're achieving nothing. You're not freeing memory in any sense. You're just offloading data to the hard drive.
But the system will do that automatically when it needs to. And there's no point making room in RAM until it's needed. You're just slowing down your application, that's all.
TLDR: "Working set size" is not "how much memory application uses". It's "how much is ready right now". Don't try to minimize it, you're just making things worse.
Delphi memory manager
OS gives you virtual memory in pages of 4Kb. But often you need it in much smaller chunks. For instance, 4 bytes for your integer, or 32 bytes for some structure. The solution?
Application memory manager, such as FastMM or BorlandMM or others.
Its job is to allocate memory in pages from the operating system, then give you small chunks of those pages when you need them.
In other words, when you ask for 14 bytes of memory, this is what happens:
You ask FastMM for 14 bytes of memory.
FastMM asks OS for 1 page of memory (4096 bytes).
OS grants one page of memory, backing it up with RAM (it's stored in actual RAM).
FastMM saves that page, cuts 14 bytes of it and gives to you.
When you ask for another 14 bytes, FastMM just cuts another 14 bytes from the same page.
What happens when you release memory? The same thing backwards:
You release 14 bytes to FastMM. Nothing happens.
You release another 14 bytes. FastMM sees that the 4096 byte page it allocated is now completely unused.
Therefore it releases the page, returning it to the system.
It's worth noting that FastMM cannot release just 14 bytes to the system. It has to release memory in pages. Until the whole page is free, FastMM cannot do a thing. Nobody can.
So, why is my working set size so big, even though I released everything?
First, your working set size is not what you should be measuring. Virtual memory consumption is. But if you have big working set size, your virtual memory consumption will be high too.
What's the problem? You should be able to figure out by this point.
Let's say you allocate 1kb, then 3kb of memory. How much virtual memory have you allocated? 4kb, 1 page.
Now you release 3Kb. How much virtual memory do you use now? 1Kb? No, it's still 1 page. You cannot allocate less than 1 page from the system. You're still using 4096 bytes of virtual memory.
Imagine if you do that 1000 times. 1kb, 3kb, 1kb, 3kb, 1kb, 3kb and so on. You allocate 1000 * 4kb = 4 mb like that, and then you release all the 3kb parts. How much virtual memory do you use now?
Still 4 mb. Because you allocated 1000 pages at first. Of every page you took 1kb and 3kb chunks. Even if you release 3kb chunks, 1kb chunks will continue to keep every single page you allocated in memory. And every page takes 4kb of virtual memory.
Memory manager cannot magically "move" all of your 1kb chunks together. This is impossible, because their virtual addresses can be referenced from somewhere in code. It's not a trait of FastMM.
But why with BorlandMM everything works better?
Coincidence. Maybe it just so happens that BorlandMM gives you memory in a slightly different way than FastMM does. Next thing you know, you change something in your app and BorlandMM acts just like FastMM did. It's impossible for a memory manager to completely prevent this effect, called memory fragmentation.
So what do I do?
Short answer is, not much until this bothers you.
You see, with modern operating systems, you're not really eating anyone's RAM. Per above, OS will automatically swap your pages out when it needs RAM for other applications. This should not be a concern.
And the "excessive" memory isn't lost. Although pages are allocated, 3kb of each is marked as "free". Next time your app needs memory, memory manager will use that space.
But if you really want to help it, you should reorganize your allocations so that the ones you're planning on keeping are done first, and the ones you will soon release are all allocated after that.
Like this: 1kb, 1kb, 1kb, ..., 3kb, 3kb, 3kb...
If you now release all the 3kb chunks, your virtual memory consumption will drop significantly.
This is not always possible. If it's impossible, then just do nothing. It's more or less alright like it is.
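A minimal sketch of the ordering idea (purely illustrative; the exact page reuse depends on the memory manager's size classes): long-lived blocks are allocated first, short-lived blocks afterwards, so releasing the short-lived ones frees contiguous space instead of leaving a survivor on every page.

    procedure AllocateInFriendlyOrder;
    const
      BlockCount = 1000;
    var
      KeepBlocks, TempBlocks: array[0..BlockCount - 1] of Pointer;
      i: Integer;
    begin
      // Long-lived 1kb blocks first...
      for i := 0 to BlockCount - 1 do
        GetMem(KeepBlocks[i], 1024);
      // ...then the short-lived 3kb blocks.
      for i := 0 to BlockCount - 1 do
        GetMem(TempBlocks[i], 3072);

      // ... use the data ...

      // Freeing the short-lived blocks now releases whole stretches of memory
      // instead of punching 3kb holes next to 1kb survivors on every page.
      for i := 0 to BlockCount - 1 do
        FreeMem(TempBlocks[i]);

      // KeepBlocks would stay allocated for the life of the real application;
      // freed here only to keep the sketch leak-free.
      for i := 0 to BlockCount - 1 do
        FreeMem(KeepBlocks[i]);
    end;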
And P.S.
You shouldn't be allocating 500 forms in the first place. This is clearly not a way to go. Fix this, and you won't even have a need to think about memory allocation and releasing.
I hope this clears things up, because four posts on the same topic, frankly, is a bit too much.
IIRC, the Delphi memory manager does not immediately return free'd memory to the OS.
Memory is allocated in chunks of small, medium and large sizes, called blocks.
These blocks are kept for a while after their contents have been disposed of, so that they are readily available when another allocation is requested later.
This limits the number of system calls required for successive allocations of multiple objects, and helps avoid heap fragmentation.
Infirming: Delphi 2007, default memory manager (should be FastMM variation). Several tests on heavy objects:
Initial memory 2Mb, peak memory 30Mb, final memory 4Mb.
Initial memory 2Mb, peak memory 1Gb, final memory 5.5Mb.
What are the heap manager stats (GetHeapStatus) at the point where 160MB is still allocated?
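A minimal sketch of how those stats can be dumped (GetHeapStatus lives in the System unit; field names per THeapStatus):

    uses Windows, SysUtils;

    procedure ReportHeapStatus;
    var
      Status: THeapStatus;
    begin
      Status := GetHeapStatus;
      OutputDebugString(PChar(Format(
        'Address space: %d KB  Committed: %d KB  Allocated: %d KB  Free: %d KB',
        [Status.TotalAddrSpace div 1024, Status.TotalCommitted div 1024,
         Status.TotalAllocated div 1024, Status.TotalFree div 1024])));
    end;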
SOLVED
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly), I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program's memory utilization returned to its original value.
Problem solved. Special thanks to Barry Kelly, the only person that pointed to the real "problem".

40 million page faults. How to fix this?

I have an application that loads 170 files (let's say they are text files) from disk into individual objects, which are kept in memory all the time. The memory is allocated once, when I load those files from disk, so there is no memory fragmentation involved. I also use FastMM to make sure my application never leaks memory.
The application compares all these files with each other to find similarities. Over-simplified, we can say that we compare text strings, but the algorithm is way more complex, as I have to allow some differences between strings. Each file is about 300KB. Loaded in memory (in the object that holds it), it takes about 0.4MB of RAM. So the running app takes about 60MB of RAM (working set). It processes the data for about 15 minutes. The thing is that it generates over 40 million page faults.
Why? I have about 2GB of free RAM. From what I know, page faults are slow. How much are they slowing down my program?
How can I optimize the program to reduce these page faults? I guess it has something to do with data locality. Does anybody know some example algorithms for this (Delphi)?
Update:
But looking at the number of page faults (no other application in Task Manager comes even close to mine), I guess that I could increase the speed of my application IF I manage to optimize the memory layout (reduce the page faults).
Delphi 7, Win 7 32 bit, RAM 4GB (3GB visible, 2GB free).
Caveat - I'm only addressing the page faulting issue.
I cannot be sure, but have you considered using memory-mapped files? This way Windows will use the files themselves as the paging file (rather than the main paging file, pagefile.sys). If the files are read-only, the number of page faults should theoretically decrease, as the pages won't need to be written out to disk via the paging file; Windows will just load the data from the file itself as needed.
Now, to reduce files paging in and out, you need to try to go through the data in one direction so that as new data is read, older pages can be discarded forever. Here is where you trade off going over the files again versus caching data - the cache has to be stored somewhere.
Note that memory-mapped files are how Windows loads .dlls and .exes, amongst other things. I've used them to scan through gigabyte files without hitting memory limits (we had MBs in those days, not GBs of RAM).
However, from the data you describe, I'd suggest that not having to go back over files will reduce the amount of repaging going on.
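A minimal sketch of mapping one of the files read-only (error handling trimmed; release the view later with UnmapViewOfFile):

    uses Windows, SysUtils;

    // Maps FileName read-only and returns a pointer to its first byte.
    function MapFileReadOnly(const FileName: string; out Size: Int64): Pointer;
    var
      FileHandle, MapHandle: THandle;
      SizeLow, SizeHigh: DWORD;
    begin
      FileHandle := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ,
        nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
      if FileHandle = INVALID_HANDLE_VALUE then
        RaiseLastOSError;
      try
        SizeLow := GetFileSize(FileHandle, @SizeHigh);
        Size := (Int64(SizeHigh) shl 32) or SizeLow;
        MapHandle := CreateFileMapping(FileHandle, nil, PAGE_READONLY, 0, 0, nil);
        if MapHandle = 0 then
          RaiseLastOSError;
        try
          // Pages of the view are backed by the file itself, not pagefile.sys.
          Result := MapViewOfFile(MapHandle, FILE_MAP_READ, 0, 0, 0);
          if Result = nil then
            RaiseLastOSError;
        finally
          CloseHandle(MapHandle); // the view keeps the mapping object alive
        end;
      finally
        CloseHandle(FileHandle); // the mapping keeps the file open
      end;
    end;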
On my machine, most page faults are reported for Developer Studio, which is reported to have 4M page faults after 30+ minutes of total CPU time. You get 10 times more, in half the time, and memory is scarce on my system. So 40M faults seems like a lot.
It could just maybe be that you have a memory leak.
The working set is only the physical memory in use by your application. If you leak memory and don't touch it, it will get paged out. You will see the virtual memory usage (or page file use) increase. These pages might be swapped back in when the heap manager walks the heap, only to be swapped out again by Windows.
Because you have a lot of RAM, the swapped out pages will stay in physical memory, as nobody else needs them. (a page recovered from RAM counts as a soft fault, from disk as a hard one)
Do you use an exponential resize scheme?
If you grow the block of memory in too-small increments while loading, it might constantly request large blocks from the system, copy the data over, and then release the old block (assuming that FastMM (de)allocates very large blocks directly from the OS).
Maybe somehow this causes a loop where the OS releases memory from your app's process and then adds it again, causing page faults on first write.
Also avoid the TStringList.Load* methods for very large files; IIRC these consume twice the space needed.
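A minimal sketch of the exponential-growth idea (buffer type and names are illustrative); growing the capacity by roughly 50% each time keeps the number of reallocations, and therefore the copying and repaging, logarithmic in the final size:

    type
      TDynByteArray = array of Byte;

    // Grow Buffer to hold at least Needed bytes without resizing it
    // for every small read.
    procedure EnsureCapacity(var Buffer: TDynByteArray; Needed: Integer);
    var
      Capacity: Integer;
    begin
      Capacity := Length(Buffer);
      if Needed <= Capacity then
        Exit;
      if Capacity < 4096 then
        Capacity := 4096;
      while Capacity < Needed do
        Capacity := Capacity + Capacity div 2; // grow by ~50%
      SetLength(Buffer, Capacity);             // one reallocation per growth step
    end;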

What will happen if an application is too large to be loaded into the available RAM?

There is a chance that a heavyweight application needs to be launched on a low-configuration system (especially when the system has too little memory).
Also, when we have already opened a lot of applications in the system and we keep trying to open new applications, what will happen?
I have only seen applications take time to process, or hang for some time, when I operate them on a low-configuration system with little memory and an old processor.
How is it able to accommodate many applications when the memory is low (like 128 MB or less)?
Does it involve paging or something else?
Can someone please explain the theory behind this?
"Heavyweight" is a very vague term. When the OS loads your program, the EXE is mapped in your address space, but only the code pages that run (or data pages that are referenced) are paged in as necessary.
You will likely get horrible performance if pages need to constantly be swapped as the program runs (aka many hard page faults), but it should work.
Since your commit charge is near the commit limit, and the commit limit will likely have no room to grow, you will also likely receive many malloc()/VirtualAlloc(..., MEM_COMMIT)/HeapAlloc()/{Local|Global}Alloc() failures, so you need to watch the return codes in your program.
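For example, a minimal sketch (Delphi) of checking one such return value instead of assuming the allocation succeeded:

    uses Windows, SysUtils;

    procedure TryCommitBlock;
    var
      P: Pointer;
    begin
      // Near the commit limit this call can legitimately fail, so check the
      // result rather than letting a nil pointer crash the program later.
      P := VirtualAlloc(nil, 64 * 1024 * 1024, MEM_RESERVE or MEM_COMMIT,
        PAGE_READWRITE);
      if P = nil then
        OutputDebugString(PChar('VirtualAlloc failed, error ' +
          IntToStr(GetLastError)))
      else
        VirtualFree(P, 0, MEM_RELEASE);
    end;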
Some keywords for search engines are: paging, swapping, virtual memory.
Wikipedia has an article called Paging (Redirected from Swap space).
Virtual memory is commonly used. Virtual memory pages are mapped to physical memory when they are used. If a physical page is needed and none is available, another page is written out to disk. This is called swapping, and it explains why crowded systems get slow and why memory upgrades have a positive effect on performance.

Resources