My Delphi 6 program crashes because TotalAddrSpace (THeapStatus) at some point hits the 2 GB level. I have been able to raise that limit to 4 GB (using {$SetPEFlags $20}), but that only delays the eventual crash.
The problem is that the TotalUncommitted memory keeps increasing for some reason, while the TotalCommitted and TotalAllocated memory both stabilize nicely at an acceptable level (about 550 MB).
I cannot quite figure out WHY the TotalUncommitted memory keeps increasing until it eventually makes TotalAddrSpace hit the 2 GB (now: 4 GB) level, at which point the program crashes.
In the program I use many dynamic arrays whose length I regularly grow or shrink with a simple SetLength call. Can this constant resizing of dynamic arrays effectively lead to unbounded growth of the TotalUncommitted memory?
Any advice or insight is very much appreciated.
Also, if you know of a general mechanism to actively decrease the TotalUncommitted memory, I would like to hear about it.
Thanks for all help!!
My problem turned out to be one of Heap Fragmentation (or my understanding of it).
I used SetLength to grow or shrink a dynamic array, as needed, always in steps of 5 elements. Given the size of each array element, this apparently led the OS to reserve much more memory than was actually needed, which made Heap.TotalUncommitted grow without bounds, as did Heap.TotalAddrSpace.
I tried different step sizes to see the impact. With a somewhat bigger step size the problem vanished.
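For what it's worth, here is a minimal sketch of the kind of growth policy that made the difference for me (the record type and the growth factor are only illustrative): instead of calling SetLength in fixed steps of 5, keep a separate count and grow the capacity geometrically, so reallocations become rare and the heap is churned far less.
type
  TSample = record
    A, B, C, D: Single;
  end;
  TSampleArray = array of TSample;

// Appends a value, growing the capacity by roughly 50% (plus a little slack)
// only when it is exhausted, instead of SetLength(+5) on every append.
procedure AppendSample(var Data: TSampleArray; var Count: Integer;
  const Value: TSample);
begin
  if Count = Length(Data) then
    SetLength(Data, (Length(Data) * 3) div 2 + 16);
  Data[Count] := Value;
  Inc(Count);
end;

// When the array is final, trim the unused slack once:
//   SetLength(Data, Count);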
I strongly suggest you run a special build that has MemCheck included. It's a really great tool for detecting memory leaks in your application. More modern Delphi versions have some of this built in (in part thanks to FastMM), but MemCheck has been around since the first Delphi versions and works great on versions 5, 6 and 7.
Note: 32 bit application, which is not planned to be migrated to 64 bit.
I'm working with a very memory-consuming application and have pretty much optimized all the relevant paths with respect to memory allocation/deallocation. (There are no memory leaks, no handle leaks, and no other kinds of leaks in the application itself, AFAIK and as tested. 3rd party libs which I cannot touch are of course candidates, but unlikely in my scenario.)
The application frequently allocates large one- and two-dimensional dynamic arrays of Single and of packed records of up to 4 Singles. By large I mean that a 5000x5000 array of record (Single, Single, Single, Single) is normal, with even 6 or 7 such arrays in use at a given time. This is needed because a lot of cross-computations are made on these arrays, and having them read from disk would be a real performance killer.
Having clarified that: I am getting out-of-memory errors a lot because of these large dynamic arrays, which will not go away after I release them, no matter whether I SetLength them to 0 or Finalize them. This is of course something FastMM does in order to be fast, I know that much.
I am tracking both FastMM allocated blocks and process consumed memory (RAM + PF) by using:
// Requires Windows, psAPI, SysUtils and Forms in the uses clause.
function CurrentProcessMemory(AWaitForConsistentRead: boolean): Cardinal;
var
  MemCounters: TProcessMemoryCounters;
  LastRead: Cardinal;
  maxCnt: integer;
begin
  Result := 0; // stupid D2010 compiler warning
  maxCnt := 0;
  repeat
    Inc(maxCnt);
    // Stabilization loop: in tight loops the system doesn't get much chance to
    // release allocated resources, which this function would then falsely report
    // as still being used, resulting in a false-positive memory leak report in
    // the application. So we loop here, waiting, until the reported memory
    // becomes stable.
    LastRead := Result;
    MemCounters.cb := SizeOf(MemCounters);
    if GetProcessMemoryInfo(GetCurrentProcess,
                            @MemCounters,
                            SizeOf(MemCounters)) then
      Result := MemCounters.WorkingSetSize + MemCounters.PagefileUsage
    else
      RaiseLastOSError;
    if AWaitForConsistentRead and (LastRead <> 0) and (Abs(LastRead - Result) > 1024) then
    begin
      Sleep(60);
      Application.ProcessMessages;
    end;
  until (not AWaitForConsistentRead) or (Abs(LastRead - Result) < 1024) or (maxCnt > 1000);
  // 60 seconds of waiting is a bit too much;
  // if the system is that "unstable", let's just forget it.
end;
function CurrentFastMMMemory: Cardinal;
var
  mem: TMemoryManagerUsageSummary;
begin
  GetMemoryManagerUsageSummary(mem);
  Result := mem.AllocatedBytes + mem.OverheadBytes;
end;
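For context, a hypothetical snippet of how these two functions can be used together (LogMemory is just an illustrative name; assumes Windows and SysUtils in the uses clause):
// Log both the OS view and the FastMM view around a memory-heavy operation.
procedure LogMemory(const Stage: string);
begin
  OutputDebugString(PChar(Format('%s: process = %d KB, FastMM = %d KB',
    [Stage, CurrentProcessMemory(True) div 1024, CurrentFastMMMemory div 1024])));
end;

// Typical use:  LogMemory('before'); DoHeavyOperation; LogMemory('after');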
I am running the code on a 64-bit computer and my top memory consumption before crashes is about 3.3 - 3.4 GB. After that, I get memory/resource related crashes anywhere in the application. It took me some time to pin it down to the large dynamic array usage, which was buried in some 3rd party library.
The way I am getting over this is that I made the application resume itself from where it left off, by closing and re-starting itself with certain parameters.
This is all nice and dandy if memory consumption is fair and the current operation finishes.
The big problem happens when the current memory usage is 1 GB and the next operation to process requires 2.5 GB of memory or more. My current code limits itself to an upper value of 1.5 GB of used memory before resuming, but in this situation I would have to drop the limit below 1 GB, which would basically have the application resume itself after every operation, and not even that would guarantee that everything will be fine.
What if another operation has a larger data set to process and requires a total of 4 GB or more of memory?
Note that I am not talking about 4 GB of data actually in memory, but about memory consumed by allocating huge dynamic arrays which the OS doesn't get back once they are de-allocated, and hence still sees as consumed, so it adds up.
So, my next point of attack is to force FastMM to release all (or at least part of) the memory to the OS. I'm specifically targeting the huge dynamic arrays here. Again, these are in a 3rd party library, so re-coding that is not really among the top options. It's much easier and faster to tinker with the FastMM code and write a proc to release the memory.
I can't switch from FastMM, as currently the entire application and some of the 3rd party libs are heavily coded around the use of PushAllocationGroup in order to quickly find and pinpoint any memory leaks. I know I can write a dummy FastMM unit to solve the compilation references, but I would be left without this quick and certain leak detection.
In conclusion: is there any way I can force FastMM to release at least some of its large blocks to the OS? (Well, sure there is; the actual question is: did anybody write it, and if so, mind sharing?)
Thanks
later edit:
I will come up with a small, relevant test application soon. It doesn't appear to be that easy to mock one up.
I doubt that the issue is actually down to FastMM. For huge memory blocks, FastMM will not do any sub-allocation. Your allocation request will be handled with a straight VirtualAlloc. And then deallocation is VirtualFree.
That's assuming that you are allocating those 380 MB objects in one contiguous block. I suspect that what you actually have are ragged 2D dynamic arrays, and they are not single allocations. A 5000x5000 ragged 2D dynamic array takes 5001 allocations to initialise: one for the row pointers, and 5000 for the rows. Those will be medium FastMM blocks, so there will be sub-allocation.
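To make that concrete, here is a small sketch (the record type matches the 4-Single records from the question; the rest is illustrative) contrasting the ragged layout with a single flat allocation that FastMM would hand straight to VirtualAlloc:
type
  TQuad = packed record
    A, B, C, D: Single;   // 16 bytes per element
  end;

procedure AllocationShapes;
var
  Ragged: array of array of TQuad; // 5001 separate medium-sized FastMM blocks
  Flat: array of TQuad;            // one ~381 MB block, handled by VirtualAlloc
  Row, Col, Cols: Integer;
begin
  Cols := 5000;

  // Ragged 2D array: one allocation for the row pointers + 5000 row allocations.
  SetLength(Ragged, 5000, Cols);
  Ragged[2500, 1234].A := 1.0;

  // Flat array indexed as a matrix: a single contiguous allocation.
  SetLength(Flat, 5000 * Cols);
  Row := 2500; Col := 1234;
  Flat[Row * Cols + Col].A := 1.0;  // same element as Ragged[2500, 1234]
end;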
I think you are asking too much. In my experience, any time you need over 3GB of memory in a 32 bit process, it's game over. Fragmentation of address space will stop you before you run out of memory. You cannot hope for this to work. Switch to 64 bit, or use a cleverer, less demanding allocation pattern. Or do you really need dense 2D arrays? Can you use sparse storage?
If you cannot alleviate your memory demands that way, you could use memory mapped files. This would allow you to make use of the extra memory that your 64 bit system has. The system's disk cache can be larger than 4GB and so your app can traverse more than 4GB of memory without actually needing to hit the disk.
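As a rough sketch of the memory-mapped-file idea (all names and sizes here are mine, and error handling is minimal): create a large pagefile-backed mapping and map only a window of it into the 32-bit address space at a time, letting the system cache hold the rest.
procedure MappedStorageSketch;
const
  ViewSize = 256 * 1024 * 1024;                 // map 256 MB windows at a time
var
  TotalSize: Int64;
  hMap: THandle;
  View: Pointer;
  Offset: Int64;
begin
  TotalSize := Int64(8) * 1024 * 1024 * 1024;   // 8 GB of backing store
  // Pagefile-backed section (INVALID_HANDLE_VALUE); a real file handle works too
  // and avoids charging the whole size against the pagefile up front.
  hMap := CreateFileMapping(INVALID_HANDLE_VALUE, nil, PAGE_READWRITE,
    Int64Rec(TotalSize).Hi, Int64Rec(TotalSize).Lo, nil);
  if hMap = 0 then
    RaiseLastOSError;
  try
    Offset := 0;  // must be a multiple of the 64 KB allocation granularity
    View := MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS,
      Int64Rec(Offset).Hi, Int64Rec(Offset).Lo, ViewSize);
    if View = nil then
      RaiseLastOSError;
    try
      // ... work on this 256 MB window through View^, then move the window ...
    finally
      UnmapViewOfFile(View);
    end;
  finally
    CloseHandle(hMap);
  end;
end;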
You could certainly try different memory managers. I honestly do not hold out any hope that it would help. You could write a trivial replacement memory manager that used HeapAlloc. And enable the low fragmentation heap (enabled by default from Vista on). But I sincerely doubt that it will help. I'm afraid that there won't be a quick fix for you. To resolve this you face a more fundamental modification to your code.
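To illustrate what such a trivial replacement memory manager could look like (purely a sketch; the exact memory-manager record fields differ a little between Delphi versions, and the unit must be the very first one in the project's uses clause so it is installed before anything allocates):
unit HeapMM;

interface

implementation

uses
  Windows;

// Route all Delphi allocations to the Win32 heap (which, from Vista on,
// uses the low fragmentation heap by default).

function HeapGetMem(Size: Integer): Pointer;
begin
  Result := HeapAlloc(GetProcessHeap, 0, Size);
end;

function HeapFreeMem(P: Pointer): Integer;
begin
  if HeapFree(GetProcessHeap, 0, P) then
    Result := 0   // 0 = success in the Delphi memory manager contract
  else
    Result := 1;
end;

function HeapReallocMem(P: Pointer; Size: Integer): Pointer;
begin
  Result := HeapReAlloc(GetProcessHeap, 0, P, Size);
end;

var
  HeapMemoryManager: TMemoryManager;

initialization
  HeapMemoryManager.GetMem := HeapGetMem;
  HeapMemoryManager.FreeMem := HeapFreeMem;
  HeapMemoryManager.ReallocMem := HeapReallocMem;
  SetMemoryManager(HeapMemoryManager);

end.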
Your issue as others have said is most likely attributable to memory fragmentation. You could test this by using VirtualQuery to create a picture of how memory is allocated to your application. You will very likely find that although you may have more than enough total memory for a new array, you don't have enough contiguous memory.
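A sketch of the kind of VirtualQuery walk that produces such a picture (the output format is my own; assumes Windows and SysUtils in the uses clause):
procedure DumpAddressSpace;
var
  Info: TMemoryBasicInformation;
  Addr: Int64;        // Int64 so the loop cannot wrap at the 4 GB boundary
  StateStr: string;
begin
  Addr := 0;
  // Walk the 32-bit address space region by region.
  while (Addr < $100000000) and
        (VirtualQuery(Pointer(Cardinal(Addr)), Info, SizeOf(Info)) = SizeOf(Info)) do
  begin
    case Info.State of
      MEM_FREE:    StateStr := 'free';
      MEM_RESERVE: StateStr := 'reserved';
      MEM_COMMIT:  StateStr := 'committed';
    else
      StateStr := 'unknown';
    end;
    // The largest 'free' region is the biggest single allocation still possible.
    OutputDebugString(PChar(Format('%p %10d KB %s',
      [Info.BaseAddress, Info.RegionSize div 1024, StateStr])));
    Addr := Int64(Cardinal(Info.BaseAddress)) + Info.RegionSize;
  end;
end;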
FastMM already does a lot to try and avoid problems due to memory fragmentation. "Small" allocations are done at the low end of the address space, whereas "large" allocations are done at the high end. This avoids a common problem where a series of large then small allocations, followed by all large allocations being released, results in a large amount of fragmented memory that is almost unusable. (Certainly unusable by anything slightly larger than the original large allocations.)
To see the benefits of FastMM's approach, imagine your memory laid out as follows:
Each digit represents a 100 MB block.
[0123456789012345678901234567890123456789]
Small allocations are represented by "s".
Large allocations are represented by capital letters.
[0sssss678901GGGGFFFFEEEEDDDDCCCCBBBBAAAA]
Now if you free all your large blocks, you should have no trouble performing similar large allocations later.
[0sssss6789012345678901234567890123456789]
The problem is that "large" and "small" are relative, and highly dependent on the nature of your application. FastMM defines a dividing line between "large" and "small". If you happen to have some small allocations that FastMM would classify as large, you may encounter the following problem.
[0sss4sGGGGsFFFFsEEEEsDDDDsCCCCsBBBBsAAAA]
Now if you free the large blocks you're left with:
[0sss4s6789s1234s6789s1234s6789s1234s6789]
And an attempt to allocate something larger than 400 MB will fail.
Options
You may be able to tweak the FastMM settings so that all your "small" allocations are also considered small by FastMM. However, there are a few situations where this won't work:
Any DLLs you use that allocate memory to your application but bypass FastMem may still cause fragmentation.
If you don't release all your large blocks together, those that remain may induce fragmentation which will slowly get worse over time.
You could take on the task of memory management yourself.
Allocate one very large block e.g. 3.5GB which you keep for the entire lifetime of the application.
Instead of using dynamic arrays, you determine the pointer locations to use when setting up a new array.
Of course the simplest alternative would be to go 64-bit.
You could consider alternate data structures.
Do you really need array lookup capability? If not, another structure that allocates in smaller chunks may suffice.
Even if you do need array lookup, consider a paged array (see the sketch after these options). Paged arrays are a combination of arrays and linked lists: data is stored on pages, with linked lists chaining each page.
A simple variant (since you mentioned your arrays are 2 dimensional) would be to leverage that: One dimension forms its own array providing a lookup into one of multiple arrays for the second dimension.
Related to the alternate data structures option, consider storing some data on disk. Yes performance will be slower. But if an efficient caching mechanism can be found, then maybe not so much. It would be better to be a little slower, but not crashing.
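For illustration, a minimal sketch of the paged-array idea mentioned above (the class, page size and record type are my own, not FastMM's): element I lives in page I div PageSize at offset I mod PageSize, so no single allocation is ever larger than one page and no huge contiguous region is needed.
const
  PageSize = 65536;                // 65536 x 16-byte records = 1 MB per page

type
  TQuad = packed record
    A, B, C, D: Single;
  end;
  TPage = array of TQuad;

  TPagedArray = class
  private
    FPages: array of TPage;        // many modest allocations instead of one huge one
    FCount: Integer;
    function GetItem(Index: Integer): TQuad;
    procedure SetItem(Index: Integer; const Value: TQuad);
  public
    constructor Create(ACount: Integer);
    property Items[Index: Integer]: TQuad read GetItem write SetItem; default;
    property Count: Integer read FCount;
  end;

constructor TPagedArray.Create(ACount: Integer);
var
  I: Integer;
begin
  inherited Create;
  FCount := ACount;
  SetLength(FPages, (ACount + PageSize - 1) div PageSize);
  for I := 0 to High(FPages) do
    SetLength(FPages[I], PageSize);
end;

function TPagedArray.GetItem(Index: Integer): TQuad;
begin
  Result := FPages[Index div PageSize][Index mod PageSize];
end;

procedure TPagedArray.SetItem(Index: Integer; const Value: TQuad);
begin
  FPages[Index div PageSize][Index mod PageSize] := Value;
end;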
Dynamic arrays are reference counted in Delphi, so they should be automatically released when they are not used anymore.
Like strings, they are reference counted when shared/stored in several variables/objects. So it seems you have some kind of memory/reference leak (e.g. an object in memory that still holds a reference to an array).
Just to be sure: you are not doing any kind of low-level pointer tricks, are you?
So please yes, post a test program (or send the complete program private via email) so one of us can take a look at it.
This is a follow-up to this question: What could explain the difference in memory usage reported by FastMM or GetProcessMemoryInfo?
My Delphi XE application is using a very large amount of memory, which sometimes leads to an out of memory exception. I'm trying to understand why and what is causing this memory usage: while FastMM is reporting low memory usage, when requesting TProcessMemoryCounters.PageFileUsage I can clearly see that a lot of memory is used by the application.
I would like to understand what is causing this problem and would like some advice on how to handle it:
Is there a way to know what is contained in that memory and where it has been allocated?
Is there some tool to track down memory usage by line/procedure in a Delphi application?
Any general advice on how to handle such a problem?
EDIT 1: Here are two screenshots of FastMMUsageTracker indicating that memory has been allocated by the system.
Before process starts:
After process ends:
Legend: Light red is FastMM allocated and dark gray is system allocated.
I'd like to understand what is causing the system to use that much memory, probably by understanding what is contained in that memory, or which line of code or procedure caused that allocation.
EDIT 2 : I'd rather not use the full version of AQTime for multiple reasons:
I'm using multiple virtual machines for development and their licensing system is a PITA (I'm already a registered user of TestComplete)
The LITE version doesn't provide enough information, and I won't waste money without being certain the FULL version will give me valuable information
Any other suggestions ?
Another problem might be heap fragmentation. This means you have enough memory free, but all the free blocks are too small. You might see it visually by using the source version of FastMM and using FastMMUsageTracker.pas as suggested here.
You need a profiler, but even that won't be enough in lots of places and cases. Also, in your case, you would need the full featured AQTime, not the lite version that comes with Delphi XE and XE2. (AQTIME is extremely expensive, and annoyingly node-locked, so don't think I'm a shill for SmartBear software.)
The thing is that people often mistake AQTime Allocation Profiler as only a way to find leaks. It can also tell you where your memory goes, at least within the limits of the tool. While running, and consuming lots of memory, I click Run -> Get Results.
Here is one of my applications being profiled in AQTime, with its Allocation Profiler showing exactly which class is allocating how many instances on the heap and how much memory those use. Since you report low Delphi heap usage with FastMM, that tells me that most of AQTime's ability to analyze by Delphi class name will also be useless to you. However, by using AQTime's events and triggers, you might be able to figure out what areas of your application are causing you a "memory usage expense" and, when those occur, what the expense is. AQTime's real-time instrumentation may be sufficient to help you narrow down the cause, even though it might not automatically find for you which function call is causing the most memory usage.
The column names include "Object Name" which includes things like this:
* All Delphi classes, with their instance count and heap usage.
* Virtual Memory blocks allocated via Win32 calls.
It can detect Delphi and C/C++ library allocations on the heap, and can see certain Windows-API level memory allocations.
Note the live count of objects and the amount of heap memory that is used.
I usually try to figure out the memory cost of a particular operation by measuring heap memory use before, and just after, some expensive operation, but before the cleanup (freeing) of the memory from that expensive operation. I can set event points inside AQTime and when a particular method gets hit or a flag gets turned on by me, I can measure before, and after values, and then compare them.
FastMM alone cannot even detect a non-Delphi allocation, or an allocation from a heap that is not being managed by FastMM. AQTime is not limited in that way.
I am trying to work through some low-memory conditions using Instruments. I can watch memory consumption in the Physical Memory Free monitor drop down to a couple of MB, even though Allocations shows that All Allocations is about 3 MB and Overall Bytes is 34 MB.
I have started to experience crashing since I moved some operations to a separate thread with an NSOperationQueue. But I wasn't using Instruments before the change. Nevertheless, I'm betting I did something that I can undo to stop the crashes.
By the way, it is much more stable without Instruments or the debugger connected.
I have the leaks down to almost none (maybe a hundred bytes max before a crash).
When I look at Allocations, I only see very primitive objects, and the total memory reported by it is also very low. So I can't see how my app is causing these low-memory warnings.
When I look at Heap Shots from the start up, I don't see more than about 3 MB there, between the baseline and the sum of all the heap growth values.
What should I be looking at to find where the problem is? Can I isolate it to one of my view controller instances, for example? Or to one of my other instances?
What I have done:
I powered the device off and back on, and this made a significant improvement. Instruments is not reporting a low-memory warning. Also, I noticed that Physical Free Memory at start-up was only about 7 MB before restarting, and it's about 60 MB after restarting.
However, I am seeing a very regular (periodic) drop in Physical Free Memory, dropping from 43 MB to 6 MB (and then back up to 43 MB). I would like to know what is causing that. I don't have any timers running in this app. (I do have some performSelector:afterDelay: calls, but those aren't active during these tests.)
I am not using ARC.
The Allocations and Leaks instruments only show what the objects themselves take, but not what their underlying non-object structures (the backing stores) are taking. For example, for UIImages they will show only a few allocated bytes. This is because a UIImage object itself only takes those bytes; the CGImageRef that actually contains the image data is not an object, and it is not taken into account in these instruments.
If you are not doing it already, try running the VM Tracker at the same time you run the Allocations instrument. It will give you an idea of the type of memory that is being allocated. For iOS, the "Dirty Memory" shown by this instrument is what normally triggers the memory warnings. Dirty memory is memory that cannot be automatically discarded by the VM system. If you see lots of CGImages, images might be your problem.
Another important concept is abandoned memory. This is memory that was allocated, it is still referenced somewhere (and as such not a leak), but not used. An example of this type of memory is a cache of some sort, which is not freeing up upon memory warning. A way to find this out is to use the heap shot analysis. Press the "Mark Heap" button of the allocations instrument, do some operation, return to the previous point in the app and press "Mark Heap" again. The second heap shot should show you what new objects have been allocated between those two moments, and might shed some light on the mystery. You could also repeat the operation simulating a memory warning to see if that behaviour changes.
Finally, I recommend you to read this article, which explains how all this works: http://liam.flookes.com/wp/2012/05/03/finding-ios-memory/.
The difference between physical memory from VM Tracker and allocated memory from "Allocations" is due to the major differences of how these instruments work:
Allocations traces what your app does by installing a tap in the functions that allocate memory (malloc, NSAllocateObject, ...). This method yields very precise information about each allocation, like position in code (stack), amount, time, type. The downside is that if you don't trace every function (like vm_allocate) that somehow allocates memory, you lose this information.
VM Tracker samples the state of the system's virtual memory in regular intervals. This is a much less precise method, as it just gives you an overall view of the current state. It operates at a low frequency (usually something like every three seconds) and you get no idea of how this state was reached.
A known culprit of invisible allocations is CoreGraphics: It uses a lot of memory when decompressing images, drawing bitmap contexts and the like. This memory is usually invisible in the Allocations instrument. So if your app handles a lot of images it is likely that you see a big difference between the amount of physical memory and the overall allocated size.
Spikes in physical memory might result from big images being decompressed, downsized and then only used in screen resolution in some view's or layer's contents. All this might happen automatically in UIKit without your code being involved.
I have the leaks down to almost none (maybe a hundred bytes max before a crash).
In my experience, even very small leaks are a "dangerous" sign. In fact, I have never seen a leak larger than 4K, and the leaks I usually see are a couple of hundred bytes. Still, they usually "hide" behind themselves a much larger amount of memory which is lost.
So, my first suggestion is: get rid of those leaks, even though they seem small and insignificant -- they are not.
I have started to experience crashing since I moved some operations to a separate thread with an NSOperationQueue.
Is there a chance that the operation you moved to the thread is responsible for the pulsing peak? Could it be spawned more than once at a time?
As to the peaks, I see two ways you can go about them:
use the Time Profiler in Instruments and try to understand what code is executing while you see the peak rising;
selectively comment out portions of your code (I mean: entire parts of your app -- e.g., replace a "real" controller with a basic/empty UIViewController, etc) and see if you can identify the culprit this way.
I have never seen such a pulsating behaviour, so I assume it depends on your app or on your device. Have you tried with a different device? What happens in the simulator (do you see the peak)?
When I'm reading your text, I have the impression that you might have some hidden leaks. I could be wrong, but are you 100% sure that you have checked all leaks?
I remember one particular project I was doing a few months ago; I had the same kind of issue, and no leaks in Instruments. My memory kept growing and I got memory warnings... I started to log in some important dealloc methods, and I saw that some objects, subviews (UIView), were "leaking". But they were not seen by Instruments because they were still attached to a main view.
Hope this was helpful.
In the Allocations Instrument make sure you have "Only Track Active Allocations" checked. See Image Below. I think this makes it easier to see what is actually happening.
1. Have you run Analyze on the project? If there are any analyzer warnings, fix them first.
2. Are you using any CoreFoundation stuff? Some of the CF methods have ... strange ... interactions with the ObjC runtime and memory management (they shouldn't, AFAICS, but I've seen some odd behaviour with the low-level image and AV manipulations where it seems like memory is being used outside the core app process - maybe the OS calls being used by Apple?)
... NB: there have also, in previous versions of iOS, been a few memory leaks inside Apple's CF methods. IIRC the last of those was fixed in iOS 5.0.
3. Are you doing something with a large number of / large-sized CALayer instances (or UIViews with CG* methods, e.g. a custom drawRect method in a UIView)?
... NB: I have seen the exact behaviour you describe caused by 2 and 3 above, either in the CF libraries, or in the Apple windowing system when it tries to work with image data that was originally generated inside CF libraries - or which found its way into CALayers.
It seems that Instruments DOES NOT CORRECTLY TRACK memory usage inside the CA / CG system; this area is a bit complex, since Apple is shuffling back and forth between CPU and GPU RAM, but it's disappointing that the memory usage seems to simply "disappear" when it clearly is still being used!
4. Final thought: are you using the invisible RHS of Instruments?
Apple hardcoded Instruments to always disable it every time you run it (so you have to keep manually opening it). This is stupid, since some of the core information only exists in the RHS bar. But I've worked with several people who didn't even know it existed :)
I'm finding AQTime hard to use because it interferes with the original program too much. If I have a program that uses, for example, 300MB of ram I can use AQTime's allocation profiler without a problem, and find out where most of the memory is being used. However I notice that running under AQTime, the original program uses more like 1GB while it's being profiled.
Right now I'm trying to reduce memory usage in a program which is using 1.4GB of memory. If I run it under AQTime, then the original program uses all of the 2GB address space and crashes. I can of course invent a smaller set of test data and estimate how the memory usage will scale with the full data set - but the reason I'm using a profiler in the first place is to try to avoid this sort of guesswork.
I already have AQTime set to 'Collect stack information - None' and all the check boxes to do with checking memory integrity are switched off, and I've tried restricting the area being profiled to just a few classes but this doesn't seem to improve anything. Is there a way to use AQTime that produces a smaller overhead? Or failing that, what other approaches are there to get a good idea of the memory being used?
The app is written in Delphi 2010 and I'm using AQTime 6.
NB: On top of the increased memory usage, running under AQTime slows the app down an awful lot, making the whole exercise not just impossible but impractical too :-P
AFAIK the allocation profiler will track memory block allocations regardless of profiling areas; profiling areas are used to track class instantiation. Of course, memory-profiling an application that allocates a large amount of memory is an issue. You may try to use the LARGE_ADDRESS_AWARE flag and the /3GB boot switch, or use a 64-bit system (as long as you have at least 4 GB of memory, or more). You can also take a snapshot of the application state before it crashes, to see where the memory is allocated. Profiling takes time anyway; you may have to let it run for a while.
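For reference, the large-address-aware flag can be set directly from the Delphi project file; it is the same $20 flag used in the first question (this assumes Windows is already in the uses clause so the constant is visible to the compiler):
// In the .dpr, after the uses clause:
{$SetPEFlags IMAGE_FILE_LARGE_ADDRESS_AWARE}   // equivalent to {$SetPEFlags $20}
// With this set, a 32-bit process gets 4 GB of address space on a 64-bit OS,
// or 3 GB on 32-bit Windows booted with the /3GB switch.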
In a previous post ( My program never releases the memory back. Why? ) I show that FastMM can cache (read as hold for itself) pretty large amounts of memory. If your application just loaded a large data set in RAM, after releasing the data, you will see that impressive amounts of RAM are not released back to the memory pool.
I looked around and it seems that calling the SetProcessWorkingSetSize API function will "flush" the cache to disk. However, I cannot decide when to call this function. I wanted to call it at the end of the OnClick event of the button that performs the RAM-intensive operation, but some people are saying that this may cause an AV.
If anybody used this function successfully, please let me (us) know.
Many thanks.
Edit:
1. After releasing the data set, the program still takes large amounts of RAM. After calling SetProcessWorkingSetSize the size returns to a few MB. Some argue that nothing was actually released back; I agree. But the memory footprint is now small AND it does NOT increase back up while using the program normally (for example, when performing normal operations that do not involve loading large data sets). Unfortunately, there is no way to demonstrate that the memory swapped to disk is ever loaded back into memory, but I think it is not.
2. I have already demonstrated (I hope) this is not a memory leak:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory
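For completeness, this is the call people usually mean when they suggest trimming the working set (just a sketch, requiring Windows and SysUtils; whether it is a good idea at all is exactly what the answer below disputes):
// Passing -1 ($FFFFFFFF) for both sizes asks Windows to page out as much of the
// process's working set as it can. Nothing is freed; it is only moved to the
// pagefile, so the next access to that memory will be slower.
procedure TrimWorkingSet;
begin
  if not SetProcessWorkingSetSize(GetCurrentProcess, $FFFFFFFF, $FFFFFFFF) then
    RaiseLastOSError;
end;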
If SetProcessWorkingSetSize would solve your problem, then your problem is not that FastMM is keeping hold of memory, since this function just trims the working set of your application by writing the memory in RAM to the page file. Nothing is released back to Windows.
In fact, you have only made accessing the memory slower, since it now has to be read back from disk. This method has the same effect as minimising your application: Windows then presumes you are not going to use the application again soon and writes the working set in RAM to the pagefile. Windows does a good job of deciding when to write RAM to the pagefile and tries to keep the most used memory in RAM as long as it can. It will make the working set smaller (write it to the pagefile) when there is little RAM left. I would not mess with it just to give the illusion that your program is using less memory, while in fact it is using just as much as before, only now it is slower to access. Memory that is accessed again will be loaded back into RAM and make the working set grow again. Touching less memory keeps the working set smaller.
So no, this will not help you force FastMM to release the memory. If your goal is for your application to use less memory, you should look elsewhere: look for leaks, look for heap fragmentation, look for optimisations, and if you think FastMM is keeping you from doing so, you should try to find facts to support it. If your goal is to keep your working set small, you could try to keep your memory access local. Maybe FastMM or another memory manager could help you with that, but it is a very different problem from using too much memory. And maybe this function does help you solve the problem you are having, but I would use it with care and certainly not just to keep up the illusion that your program has low memory usage.
I agree with Lars Truijens 100%. If you don't, you can check the FastMM memory usage via the FastMM calls GetMemoryManagerState and GetMemoryManagerUsageSummary before and after calling the SetProcessWorkingSetSize API.
Are you sure there is a problem? Working sets might only decrease when there really is a memory shortage.
Problem solved:
I don't need to use SetProcessWorkingSetSize. FastMM will eventually release the RAM.
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly), I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program's memory utilization returned to its original value.
I used this function just once, when I implemented TWebBrowser. This component used so much memory, even after I destroyed the instance.