Understanding how software accumulates memory in system

Lately I've been trying to understand and track down a nasty memory leak in my software. To do this, I started to monitor memory usage over long periods of time to try to figure out if there is any pattern that would serve as a clue to understand this issue.
In the graph below, virtual memory is drawn in purple and %MEM (the share of physical memory used by the process) in green; the x-axis represents time in seconds.
There are some big spikes that occur when a video streaming feature is activated, but this doesn't seem to be an issue since the software seems to be able to clear these.
Around second 7500 there is a big drop because a stand-by feature of the system was activated for a few seconds. After the system returns to normal, it clears some memory that had accumulated before. So far this makes sense. The thing I can't understand is this: if the amount of memory held decreases, why doesn't %MEM also decrease? In this case, it is actually increasing.
There is not a clear correlation between %Mem and virtual memory usage. Can anyone help me understand this?

I realized that virtual memory size is not directly related to %MEM: it counts the whole address space of the process, including pages that are swapped out to disk or have never been loaded into RAM. The part of a process's memory that %MEM reflects is the RSS (Resident Set Size), which is the memory currently held in RAM.
In the next graph I monitored RSS instead, and the correlation is obvious.
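To make the distinction concrete, here is a minimal sketch that assumes a Linux system, where /proc/<pid>/status reports both values as VmSize and VmRSS. The sample text, the numbers and the helper names are made up for illustration; only the two field names come from the kernel's status format.

```python
# Hypothetical excerpt in the style of /proc/<pid>/status (values in kB).
SAMPLE = "Name:\tmyapp\nVmSize:\t  204800 kB\nVmRSS:\t   51200 kB\n"


def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from status-style text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # first token is the number
    return fields


def percent_mem(rss_kb, total_kb):
    """%MEM is resident memory as a percentage of physical memory."""
    return 100.0 * rss_kb / total_kb


stats = parse_status(SAMPLE)
# Only VmRSS enters the percentage; VmSize can change without affecting it.
share = percent_mem(stats["VmRSS"], 2048000)
```

On a real system the text would come from reading /proc/self/status (or the status file of the monitored PID) at each sampling interval.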

Related

Apple Instruments slows down app when analyzing memory allocations

When running my app in the simulator and analyzing its memory allocations using Instruments, the app runs very slowly, at less than 1/30 of its normal speed.
The app uses about 50 MB RAM and has approximately 900,000 live objects (according to Instruments).
Could this be the reason for the slow performance?
When running the app on the device or in the simulator without using Instruments, it performs well (except for the memory issue I am trying to debug).
Do you have any idea on how to solve this issue?
Did you encounter slow performance using the Memory Allocation instruments?
Would you consider having more than 900,000 live objects "concerning"?

Considering your Analyzer performance issue
In your specific case monitoring the app over a long period of time will not be necessary, as you reach the state of high memory consumption very soon. You could simply stop recording at this point. Then you won't have problems navigating through the different views and statistics to find the cause of the memory issue.
Analyzing the memory issue
Some slowdown is normal, but 1/30 sounds quite alarming.
You should probably track how the number of live objects and the memory usage change while you use the app.
It is difficult to decide whether a certain number of live objects at a specific point in time is critical (though 900,000 seems very high).
In general: if live objects and memory usage grow continuously and don't shrink, that is a bad sign.
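As a rough illustration of that rule of thumb, the following sketch checks a series of readings (for example live-object counts noted down at regular intervals) for growth that never shrinks. The function name, the tolerance parameter and the sample numbers are hypothetical; this is not an Instruments API.

```python
def grows_without_shrinking(samples, tolerance=0):
    """Return True if the readings never drop by more than `tolerance`
    and the last reading is higher than the first."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a - tolerance for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]


# Hypothetical live-object counts taken while using the app.
leaking = [100_000, 250_000, 250_000, 900_000]
healthy = [100_000, 900_000, 120_000, 880_000, 110_000]
```

A series like `healthy`, which spikes but keeps returning to its baseline, is the pattern you would hope to see.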
If you take a look at Statistics -> Object Summary, Live Bytes should be a lot smaller than Overall Bytes, and the number of #Living objects should be a lot smaller than the number of #Transitory objects.
The second thing you can look at is the Call Tree view.
It gives you a nice overview of which parts of the application are responsible for reserving large amounts of memory.
Possible solutions
Once you have found the parts of your code that are responsible for reserving the large amounts of memory, you can look for retain cycles, or you could try to use more autorelease pools in those spots.

Check that you have enough available disk space. I had 8 GB left and it seems that was too little. Instruments was extremely slow: it took a minute just to start and was barely usable.
I cleared out more disk space and then it suddenly went back to being fast as before.

How to reserve memory for my application and leave a specified amount remaining?

I'm planning an application which will involve loading many pictures at one time and thus requires a large chunk of memory. For example, I might have 50 image objects created at once, taking a total of 1GB of RAM. But when the user goes to load 20 more pictures, I'd like to make sure that amount of memory is already reserved and ready.
Now this part might seem a little backwards from normal. Rather than specifying how much memory my application shall reserve, instead I need to specify how much memory to leave free for other applications, and adjust my application's memory periodically according to this specification. I must say I've never worked with reserving memory at all, and especially won't know how to leave this remaining available memory.
So for example, if the computer has 2048 MB of RAM, and the option is set to leave 50 MB free for other applications, and there is already 10MB of RAM being used by other apps, then it should reserve 2048-50-10 = 1988 MB for my app.
The trouble I foresee: suppose the user opens another application which requires 1GB. My app has to catch this and shrink itself.
Does this even sound like a feasible approach? Basically, I need to make sure there is as much memory reserved as possible at any given time, while leaving a decent amount available for other apps. Would it make a significant impact on performance if I do this, or not much at all? I might be loading and unloading images at rapid paces, and I don't want it to reserve/free this memory on demand, I want it to stay reserved.
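The arithmetic in the example above can be written down as a small sketch. The function name and parameters are hypothetical, and how the amount used by other applications would actually be measured is left open.

```python
def reserve_target_mb(total_mb, leave_free_mb, used_by_others_mb):
    """Memory the app would aim to reserve, never going below zero."""
    return max(0, total_mb - leave_free_mb - used_by_others_mb)


# The example from the question: 2048 MB total, 50 MB left free, 10 MB in use.
initial = reserve_target_mb(2048, 50, 10)
# Another application then takes 1 GB (1024 MB) on top of the 10 MB.
after_other_app = reserve_target_mb(2048, 50, 10 + 1024)
```

The second call shows how far the app would have to shrink in the scenario where another program suddenly needs 1GB.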

+1 for Sertac's mentioning of how SQL Server rides the line of allocating memory it needs, but releasing memory when Windows complains.
Applications can receive Windows' complaints by using the CreateMemoryResourceNotification function:
hLowMemory := CreateMemoryResourceNotification(LowMemoryResourceNotification);
Applications can use memory resource notification events to scale the memory usage as appropriate. If available memory is low, the application can reduce its working set. If available memory is high, the application can allocate more memory.
Any thread of the calling process can specify the memory resource notification handle in a call to the QueryMemoryResourceNotification function or one of the wait functions. The state of the object is signaled when the specified memory condition exists. This is a system-wide event, so all applications receive notification when the object is signaled. Note that there is a range of memory availability where neither the LowMemoryResourceNotification or HighMemoryResourceNotification object is signaled. In this case, applications should attempt to keep the memory use constant.
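The three cases described in that passage (low signaled, high signaled, neither) boil down to a small decision rule. The sketch below models only that rule; the two booleans stand in for the states that QueryMemoryResourceNotification would report for the low and high notification handles, and the `Action` names are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    SHRINK = "shrink"  # release cached data, reduce the working set
    HOLD = "hold"      # keep memory use constant
    GROW = "grow"      # free to allocate more


def choose_action(low_signaled, high_signaled):
    """Map the two notification states onto what the app should do."""
    if low_signaled:
        return Action.SHRINK
    if high_signaled:
        return Action.GROW
    # Neither object is signaled: the in-between range.
    return Action.HOLD
```

Checking the low-memory state first is a deliberate choice here: if an application ever saw both states, giving memory back is the safer reaction.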
But it's also worth mentioning that you might as well just allocate the memory that you need. Your operating system has a very sophisticated set of algorithms to swap out the least used memory when memory pressure is high. You can take advantage of this by simply allocating all the memory that you need. When Windows starts to run low, it will find the pages of memory that you are using the least and swap them out to disk. (This is how a well-known reverse proxy works.)
The only thing left is to decide if you want to free some images when Windows says it's running low on RAM. But if you're not using the memory, it is going to be swapped out to disk for you.

It's not realistic to account for other apps. Just ignore them. The system will page things in and out as needed. If you really wanted to do this you'd have to dynamically adapt to other processes as they start and finish. That's really not realistic. What's more it's not practical to inquire of other processes how much memory they need. Leave it all to the system.
Set a budget for your app and make sure you don't exceed it. Keep the most recently used images in memory and when you approach your memory budget throw away the least recently used images to make space.
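A minimal sketch of such a budget, assuming images are held as byte strings and identified by a key. The class name and its interface are hypothetical and only illustrate the least-recently-used eviction described above.

```python
from collections import OrderedDict


class ImageCache:
    """Keeps images in memory up to a byte budget, evicting the least
    recently used ones first."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> image bytes, oldest first

    def get(self, key):
        data = self.items.get(key)
        if data is not None:
            self.items.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        if key in self.items:
            self.used -= len(self.items.pop(key))
        self.items[key] = data
        self.used += len(data)
        # Evict oldest entries until we fit; always keep the newest image,
        # even if it alone exceeds the budget.
        while self.used > self.budget and len(self.items) > 1:
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)


cache = ImageCache(budget_bytes=100)
cache.put("a", b"x" * 40)
cache.put("b", b"x" * 40)
cache.get("a")             # "a" is now more recently used than "b"
cache.put("c", b"x" * 40)  # over budget, so "b" is evicted
```

In a real application the evicted image would be reloaded from disk on the next request, which is the price paid for staying inside the budget.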
If you are stressing the available resources, then make sure you use FastMM and enable LARGE_ADDRESS_AWARE for your app so that you get a 4GB address space when running on a 64-bit OS.

How to use AQTime's memory allocation profiler in a program that uses a large amount of memory?

I'm finding AQTime hard to use because it interferes with the original program too much. If I have a program that uses, for example, 300MB of ram I can use AQTime's allocation profiler without a problem, and find out where most of the memory is being used. However I notice that running under AQTime, the original program uses more like 1GB while it's being profiled.
Right now I'm trying to reduce memory usage in a program which is using 1.4GB of memory. If I run it under AQTime, then the original program uses all of the 2GB address space and crashes. I can of course invent a smaller set of test data and estimate how the memory usage will scale with the full data set - but the reason I'm using a profiler in the first place is to try to avoid this sort of guesswork.
I already have AQTime set to 'Collect stack information - None' and all the check boxes to do with checking memory integrity are switched off, and I've tried restricting the area being profiled to just a few classes but this doesn't seem to improve anything. Is there a way to use AQTime that produces a smaller overhead? Or failing that, what other approaches are there to get a good idea of the memory being used?
The app is written in Delphi 2010 and I'm using AQTime 6.
NB: On top of the increased memory usage, running under AQTime slows the app down an awful lot, making the whole exercise not just impossible but impractical too :-P

AFAIK the allocation profiler will track memory block allocation regardless of profiling areas. Profiling areas are used to track class instantiation. Of course, memory-profiling an application that allocates a large amount of memory is an issue. You may try to use the LARGE_ADDRESS_AWARE flag and the /3GB boot switch, or use a 64-bit system (as long as you have at least 4GB of memory). You can also take a snapshot of the application state before it crashes, to see where the memory is allocated. Profiling takes time anyway; you may have to let it run for a while.

When to call SetProcessWorkingSetSize? (Convincing the memory manager to release the memory)

In a previous post ( My program never releases the memory back. Why? ) I show that FastMM can cache (read as hold for itself) pretty large amounts of memory. If your application just loaded a large data set in RAM, after releasing the data, you will see that impressive amounts of RAM are not released back to the memory pool.
I looked around and it seems that calling the SetProcessWorkingSetSize API function will "flush" the cache to disk. However, I cannot decide when to call this function. I wanted to call it at the end of the OnClick event of the button that performs the RAM-intensive operation. However, some people are saying that this may cause an AV (access violation).
If anybody used this function successfully, please let me (us) know.
Many thanks.
Edit:
1. After releasing the data set, the program still takes large amounts of RAM. After calling SetProcessWorkingSetSize the size returns to a few MB. Some argue that nothing was released back. I agree. But the memory footprint is now small AND it is NOT increasing back after using the program normally (for example when performing normal operations that do not involve loading large data sets). Unfortunately, there is no way to demonstrate whether the memory swapped to disk is ever loaded back into memory, but I think it is not.
2. I have already demonstrated (I hope) this is not a memory leak:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory

If SetProcessWorkingSetSize would solve your problem, then your problem is not that FastMM is keeping hold of memory, since this function will just trim the working set of your application by writing the memory in RAM to the page file. Nothing is released back to Windows.
In fact you have only made accessing the memory again slower, since it now has to be read from disk. This method has the same effect as minimising your application: Windows then presumes you are not going to use the application again soon and also writes the working set in RAM to the page file. Windows does a good job of deciding when to write RAM to the page file and tries to keep the most used memory in RAM as long as it can. It will make the working set smaller (write to the page file) when there is little RAM left. I would not mess with it just to give the illusion that your program is using less memory while in fact it is using just as much as before, only now it is slower to access. Memory that is accessed again will be loaded into RAM again and make the working set grow again. Touching less memory keeps the working set smaller.
So no, this will not help you force FastMM to release the memory. If your goal is for your application to use less memory, you should look elsewhere. Look for leaks, look for heap fragmentation, look for optimisations, and if you think FastMM is keeping you from doing so, you should try to find facts to support it. If your goal is to keep your working set small, you could try to keep your memory access local. Maybe FastMM or another memory manager could help you with it, but it is a very different problem compared to using too much memory. And maybe this function does help you solve the problem you are having, but I would use it with care and certainly not use it just to keep up the illusion that your program has a low memory usage.

I agree with Lars Truijens 100%. If you don't, then you can check the FastMM memory usage via the FastMM calls GetMemoryManagerState and GetMemoryManagerUsageSummary before and after calling the SetProcessWorkingSetSize API.

Are you sure there is a problem? Working sets might only decrease when there really is a memory shortage.

Problem solved:
I don't need to use SetProcessWorkingSetSize. FastMM will eventually release the RAM.
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly) I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program's memory utilization returned to its original value.

I used this function just once, when I implemented TWebBrowser. This component held on to a lot of memory even after I destroyed the instance.

How much memory your program takes? (FastMM vs Borland MM)

I have recently seen a strange behavior in my program. After creating a large number of objects (500MB of RAM) and then releasing them, the program's memory footprint does not return to its original size. It still shows a footprint of 160MB (Private working set).
Normal behavior?
Borland's memory manager does not behave like this, so if possible please confirm (or refute) that this is normal behavior for FastMM: if you have a handy program in which you create a rather complex MDI child (containing several controls/objects), can you create 250 instances of that MDI child in memory in a loop (all at the same time), then release them all and check the memory footprint? Please make sure that you consume at least 200-300MB of RAM with those MDI children.
Especially those still using Delphi 7 can see the difference by temporarily disabling FastMM.
Thanks
If anybody is interested, especially if you want some proof this is not a memory leak (I hope it is not a mem leak in my code - this is also one of the points of this post: to check if it is my fault), here are the original discussions:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory

Dear Altar, I'm dazzled at how off the point you are in your guesses and how you don't listen to what people told you many times before.
Let's set some things straight. Memory management 101. Please read thoroughly.
When you allocate memory in Delphi, there are two memory managers involved.
System memory manager
The first one is the system memory manager. It is built into Windows and hands out memory in 4 KB pages.
But it doesn't always give you memory in RAM (physical memory). Your data can be kept on the hard drive and read back whenever you need to access it. This is awfully slow.
In other words, imagine you have 512 MB of physical memory. You run two programs, each requesting 1 GB of memory. What does the OS do?
It grants both requests. Both apps get 1 GB of memory each. Both think all of it is "in memory". But in fact, only 512 MB can be kept in RAM. The rest is stored in the page file, although your app does not know that. It just runs slowly.
Working set size
Now, what is the "working set size" you are measuring?
It's the part of the allocated memory that is kept in RAM.
If you have an application which allocates 1 GB of memory, and you only have 512 MB of RAM, then its working set size will be at most 512 MB. Although it "uses" 1 GB of memory!
When you run another application which needs memory, the OS will automatically free some RAM by moving rarely used blocks of "memory" to the hard drive.
Your virtual memory allocation will stay the same, but more pages will be on the hard drive and fewer in RAM. The working set size will decrease.
By this point you should understand that it's pointless to try to minimize the working set size. You're achieving nothing. You're not freeing memory in any sense. You're just offloading the data to the hard drive.
But the system will do that automatically when it needs to. And there's no point making room in RAM until it's needed. You're just slowing down your application, that's all.
TLDR: "Working set size" is not "how much memory the application uses". It's "how much is ready right now". Don't try to minimize it, you're just making things worse.
Delphi memory manager
The OS gives you virtual memory in pages of 4 KB. But often you need it in much smaller chunks. For instance, 4 bytes for an integer, or 32 bytes for some structure. The solution?
An application memory manager, such as FastMM, BorlandMM or others.
Its job is to allocate memory in pages from the operating system, then give you small chunks of those pages when you need them.
In other words, when you ask for 14 bytes of memory, this is what happens:
1. You ask FastMM for 14 bytes of memory.
2. FastMM asks the OS for 1 page of memory (4096 bytes).
3. The OS grants one page of memory, backing it with RAM (it's stored in actual RAM).
4. FastMM saves that page, cuts 14 bytes off it and gives them to you.
When you ask for another 14 bytes, FastMM just cuts another 14 bytes from the same page.
What happens when you release memory? The same thing backwards:
1. You release 14 bytes to FastMM. Nothing happens.
2. You release another 14 bytes. FastMM sees that the 4096-byte page it allocated is now completely unused.
3. Therefore it releases the page, returning it to the system.
It's worth noting that FastMM cannot release just 14 bytes to the system. It has to release memory in pages. Until the whole page is free, FastMM cannot do a thing. Nobody can.
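The allocate-and-release walk-through above can be modelled with a toy allocator. This is a sketch of the simplified description in this answer only, not of FastMM's actual algorithm; the `PageHeap` class and its methods are made up for illustration.

```python
PAGE = 4096  # page size in bytes


class PageHeap:
    """Toy allocator: carves chunks out of 4096-byte pages and gives a
    page back only when every chunk in it has been released."""

    def __init__(self):
        self.pages = {}     # page index -> number of live chunks
        self.current = -1   # index of the page being filled
        self.offset = PAGE  # bytes used in current page (forces a new page first)

    def alloc(self, size):
        if self.offset + size > PAGE:  # does not fit: ask the "OS" for a page
            self.current += 1
            self.pages[self.current] = 0
            self.offset = 0
        self.offset += size
        self.pages[self.current] += 1
        return self.current  # the handle is simply the page index

    def free(self, page):
        self.pages[page] -= 1
        if self.pages[page] == 0:  # whole page unused: return it
            del self.pages[page]
            if page == self.current:
                self.offset = PAGE  # next alloc must fetch a fresh page

    def page_count(self):
        return len(self.pages)


heap = PageHeap()
a = heap.alloc(14)
b = heap.alloc(14)   # served from the same page as `a`
pages_after_allocs = heap.page_count()
heap.free(a)         # page still holds `b`, nothing is returned
pages_after_first_free = heap.page_count()
heap.free(b)         # page is now empty and goes back to the system
pages_after_second_free = heap.page_count()
```

The page count only changes on the second release, which is the point the walk-through makes.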
So, why is my working set size so big, even though I released everything?
First, your working set size is not what you should be measuring. Virtual memory consumption is. But if you have a big working set size, your virtual memory consumption will be high too.
What's the problem? You should be able to figure it out by this point.
Let's say you allocate 1 KB, then 3 KB of memory. How much virtual memory have you allocated? 4 KB, 1 page.
Now you release the 3 KB. How much virtual memory do you use now? 1 KB? No, it's still 1 page. You cannot allocate less than 1 page from the system. You're still using 4096 bytes of virtual memory.
Imagine you do that 1000 times: 1 KB, 3 KB, 1 KB, 3 KB, 1 KB, 3 KB and so on. You allocate 1000 * 4 KB = 4 MB like that, and then you release all the 3 KB parts. How much virtual memory do you use now?
Still 4 MB. Because you allocated 1000 pages at first. From every page you took a 1 KB and a 3 KB chunk. Even if you release the 3 KB chunks, the 1 KB chunks will continue to keep every single page you allocated in memory. And every page takes 4 KB of virtual memory.
The memory manager cannot magically "move" all of your 1 KB chunks together. This is impossible, because their virtual addresses can be referenced from somewhere in code. It's not a trait of FastMM.
But why does everything work better with BorlandMM?
Coincidence. Maybe it just so happens that BorlandMM gives you memory in a slightly different way than FastMM does. Next thing you know, you change something in your app and BorlandMM acts just like FastMM did. It's impossible for a memory manager to completely prevent this effect, called memory fragmentation.
So what do I do?
Short answer is, not much until this bothers you.
You see, with modern operating systems, you're not really eating anyone's RAM. As described above, the OS will automatically swap your pages out when it needs RAM for other applications. This should not be a concern.
And the "excessive" memory isn't lost. Although the pages are allocated, 3 KB of each is marked as "free". Next time your app needs memory, the memory manager will use that space.
But if you really want to help it, you should reorganize your allocations so that the ones you're planning on keeping are done first, and the ones you will soon release are all allocated after that.
Like this: 1 KB, 1 KB, 1 KB, ..., 3 KB, 3 KB, 3 KB...
If you now release all the 3 KB chunks, your virtual memory consumption will drop significantly.
This is not always possible. If it's impossible, then just do nothing. It's more or less alright like it is.
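To put numbers on the two allocation orders, here is a small simulation. It assumes the simplified model used in this answer (chunks packed one after another into 4096-byte pages, a page returned only when empty); the function name is hypothetical and real memory managers are more elaborate.

```python
PAGE = 4096  # page size in bytes


def pages_after_release(sizes, released_size):
    """Pack chunks sequentially into pages, then release every chunk of
    `released_size`. Returns (pages allocated, pages still held)."""
    live_per_page = []  # chunks per page that are NOT released
    used = PAGE         # forces a new page for the first chunk
    for size in sizes:
        if used + size > PAGE:
            live_per_page.append(0)
            used = 0
        used += size
        if size != released_size:
            live_per_page[-1] += 1
    held = sum(1 for live in live_per_page if live > 0)
    return len(live_per_page), held


KB = 1024
interleaved = [1 * KB, 3 * KB] * 1000               # 1 KB, 3 KB, 1 KB, 3 KB, ...
grouped = [1 * KB] * 1000 + [3 * KB] * 1000         # all 1 KB first, then all 3 KB

interleaved_result = pages_after_release(interleaved, 3 * KB)
grouped_result = pages_after_release(grouped, 3 * KB)
```

In this model the interleaved order keeps all 1000 pages alive after the 3 KB chunks are released, while the grouped order uses 1250 pages at its peak but falls back to 250 afterwards.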
And P.S.
You shouldn't be allocating 500 forms in the first place. This is clearly not the way to go. Fix this, and you won't even need to think about memory allocation and releasing.
I hope this clears things up, because four posts on the same topic, frankly, is a bit too much.

IIRC, the Delphi memory manager does not immediately return freed memory to the OS.
Memory is allocated in chunks of small, medium and large sizes, called blocks.
These blocks are kept for a while after their contents have been disposed of, so that they are readily available when another allocation is requested afterwards.
This limits the number of system calls required for successive allocations of multiple objects, and helps avoid heap fragmentation.

Refuting: Delphi 2007, default memory manager (should be a FastMM variation). Several tests on heavy objects:
Initial memory 2 MB, peak memory 30 MB, final memory 4 MB.
Initial memory 2 MB, peak memory 1 GB, final memory 5.5 MB.

What are the heap manager stats (GetHeapStatus) at the point where 160MB is still allocated?

SOLVED
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly) I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program memory utilization returned to its original value.
Problem solved. Special thanks to Barry Kelly, the only person that pointed to the real "problem".
