Memory related errors - memory

I mostly work on C language for my work. I have faced many issues and spent lot time in debugging issues related to dynamically allocated memory corruption/overwriting. Like malloc(A) A bytes but use write more than A bytes. Towards that i was trying to read few things when i read about :-
1.) An approach wherein one allocates more memory than what is needed. And write some known value/pattern in that extra locations. Then during program execution that pattern should be untouched, else it indicated memory corruption/overwriting. But how does this approach work. Does it mean for every write to that pointer which is allocated using malloc() i should be doing a memory read of the additional sentinel pattern and read for its sanity? That would make my whole program very slow.
And to say that we can remove these checks from the release version of the code, is also not fruitful as memory related issues can happen more in 'real scenario'. So can we handle this?
2.) I heard that there is something called HEAP WALKER, which enables programs to detect memory related issues? How can one enable this.
thank you.
-AD.

If you're working under Linux or OSX, have a look at Valgrind (free, available on OSX via Macports). For Windows, we're using Rational PurifyPlus (needs a license).
You can also have a look at Dmalloc or even at Paul Nettle's memory manager which helps tracking memory allocation related bugs.

If you're on Mac OS X, there's an awesome library called libgmalloc. libgmalloc places each memory allocation on a separate page. Any memory access/write beyond the page will immediately trigger a bus error. Note however that running your program with libgmalloc will likely result in a significant slowdown.

Memory guards can catch some heap corruption. It is slower (especially deallocations) but it's just for debug purposes and your release build would not include this.
Heap walking is platform specific, but not necessarily too useful. The simplest check is simply to wrap your allocations and log them to a file with the LINE and FILE information for your debug mode, and most any leaks will be apparent very quickly when you exit the program and numbers don't tally up.
Search google for LINE and I am sure lots of results will show up.

Related

How to analyze excessive memory consumption (PageFileUsage) in a Delphi Application?

This is a follow-up to this question: What could explain the difference in memory usage reported by FastMM or GetProcessMemoryInfo?
My Delphi XE application is using a very large amount of memory which sometimes lead to an out of memory exception. I'm trying to understand why and what is causing this memory usage and while FastMM is reporting low memory usage, when requesting for TProcessMemoryCounters.PageFileUsage I can clearly see that a lot of memory is used by the application.
I would like to understand what is causing this problem and would like some advise on how to handle it:
Is there a way to know what is contained in that memory and where it has been allocated ?
Is there some tool to track down memory usage by line/procedure in a Delphi application ?
Any general advise on how to handle such a problem ?
EDIT 1 : Here are two screenshots of FastMMUsageTracker indicating that memory has been allocate by the system.
Before process starts:
After process ends:
Legend: Light red is FastMM allocated and dark gray is system allocated.
I'd like to understand what is causing the system to use that much memory. Probably by understanding what is contained in that memory or what line of code or procedure did cause that allocation.
EDIT 2 : I'd rather not use the full version of AQTime for multiple reasons:
I'm using multiple virtual machines for development and their licensing system is a PITA (I'm already a registered user of TestComplete)
LITE version doesn't provide enough information and I won't waste money without making certain the FULL version will give me valuable information
Any other suggestions ?
Another problem might be heap fragmentation. This means you have enough memory free, but all the free blocks are to small. You might see it visually by using the source version of FastMM and use the FastMMUsageTracker.pas as suggested here.
You need a profiler, but even that won't be enough in lots of places and cases. Also, in your case, you would need the full featured AQTime, not the lite version that comes with Delphi XE and XE2. (AQTIME is extremely expensive, and annoyingly node-locked, so don't think I'm a shill for SmartBear software.)
The thing is that people often mistake AQTime Allocation Profiler as only a way to find leaks. It can also tell you where your memory goes, at least within the limits of the tool. While running, and consuming lots of memory, I click Run -> Get Results.
Here is one of my applications being profile in AQTime with its Allocation Profiler showing exactly what class is allocating how many instances on the heap and how much memory those use. Since you report low Delphi heap usage with FastMM, that tells me that most of AQTime's ability to analyze by delphi class name will also be useless to you. However by using AQTime's events and triggers, you might be able to figure out what areas of your application are causing you a "memory usage expense" and when those occur, what the expense is. AQTime's real-time instrumentation may be sufficient to help you narrow down the cause even though it might not find for you what function call is causing the most memory usage automatically.
The column names include "Object Name" which includes things like this:
* All delphi classes, and their instance count and heap usage.
* Virtual Memory blocks allocated via Win32 calls.
It can detect Delphi and C/C++ library allocations on the heap, and can see certain Windows-API level memory allocations.
Note the live count of objects, the amount of memory from the heap that is used.
I usually try to figure out the memory cost of a particular operation by measuring heap memory use before, and just after, some expensive operation, but before the cleanup (freeing) of the memory from that expensive operation. I can set event points inside AQTime and when a particular method gets hit or a flag gets turned on by me, I can measure before, and after values, and then compare them.
FastMM alone can not even detect a non-delphi allocation or an allocation from a heap that is not being managed by FastMM. AQTime is not limited in that way.

Strategy or tools to find "non-leak" memory usage problems in Delphi?

One old application started to consume memory a lot after server update. Memory usage seems to rise with out limit until program hangs.
According to FastMM4 and EurekaLog, there's no memory leak (except 28 bytes), so I assume all memory is freed when application is shutdown.
Are there any tools or strategies suitable for tracking this kind of memory problem?
Since September 2012, there is a very simple and comfortable way to find this type of "run-time only" memory leaks.
FastMM4991 introduced a new method, LogMemoryManagerStateToFile:
Added the LogMemoryManagerStateToFile call. This call logs a summary of
the memory manager state to file: The total allocated memory, overhead,
efficiency, and a breakdown of allocated memory by class and string type.
This call may be useful to catch objects that do not necessarily leak, but
do linger longer than they should.
To discover the leak at run time, you only need these steps
add a call to LogMemoryManagerStateToFile('memory.log', '') in a place where it will be called in intervals
run the application
open the log file with a tail program (for example BareTail), which will auto-refresh when the file content changes
watch the first lines of the file, they will contain the memory allocations which occupy the highest amount of memory
if you see a class or memory type constantly has a growing number of instances, this can be the reason of your leak
The growing memory consumption is an application issue. It is not a bug, which can discover FastMM4 or EurekaLog. As from they point of view - application just correctly uses the memory.
Using AQTime, MemProof (hard to find, D7 is last supported version (?)), SleuthQA (similar to MemProof) or similar memory profilers, you can track the memory usage outside of application in real-time.
Using FastMM4, GetMemoryManagerState / GetMemoryManagerUsageSummary you can track memory usage from application. Output this information into trace file and analyze it after run. Or make simple wrapping function for one of the above procedures, which will return curent memory usage. And call it from IDE Debugger Evalute / Modify, add to Watches or call OutputDebugString, and see the current memory usage.
Note, if memory is eated by some DLL then you may not see her memory usage using (3). Use (2).
Analyzing the memory usage and the tasks performed by the application, you may discover what leads to raised memory usage.
AQTime (a commercial tool which is quite expensive) can report your memory usage, down to the line of source code that allocated each object. In the case of very large memory usage scenarios, you might want the AQTime functionality that can show the number of objects and the size (total plus individual instance size) for each object. AQTime worked great for me, starting with Delphi 7, and all later versions, including your version (2006) and the latest versions (XE and XE2).
As the program memory usage grows, AQTime can be used to grab "snapshots" of the runtime heap, you can use to understand memory usage of your application; What is being created, and how many of each object exists. Even when no leaks exist, understanding the runtime behaviour of your application in terms of the objects it creates and manages, is very important, and AQTime is the most powerful tool I know of for Delphi users.
If you are willing to upgrade to Delphi XE/XE2, you might have an included light version of AQTime already, if so, check it out. If not, I recommend you try their demo. I am unaware of any free or open source alternatives that can provide the same functionality.
Lesser functionality could be cobbled together manually by writing lots of trace messages, or using the FastMM full-debug-mode. If you could write a complete dump of your memory usage into a very large file, you might be able to write some tools to parse, and create a summary. The problem I have with FastMM in this case, is that you will be drowned in detail information, without the ability to extract exactly the summary information that helps you understand your situation. So, you can try to write your own tool to summarize the memory usage. In one application I had that used a series of components that I knew would use a lot of memory, I wrote a dialog box into my application that showed current memory usage by these large memory-blob-of-data objects.
Have you ever think about the Leak that is causing the IDE... it is so huge!!!
In my case (2GB of RAM) i do the next...
1. Open the IDE
2. Leave it minimized for near six hours
3. See how Physical memory is getting used
The result:
While IDE is oppened (remember i also do the test having it minimized) it is getting more and more RAM... till no more ram free.
It gets all 2GB RAM + all Pagefile hard disk space (i have it configured to a mas of 4GB)
In less that six hours (doing nothing on IDE) it tries to use more than 6GB.
That is called a Memory Leak casused by the IDE... i do not type any letter on IDE, do not compile anything, do not even open any project... just open IDE and minimize it... leave the computer without doing anything on it for about six hours and IDE is consuming 6GB of memory.
Of course, after that, the IDE start with annoying messages of SystemOutOfMemory... and i must kill it... then all that 6GB are freed!!!
When on the hell will this get fixed?
Please note i have all patches applied, i also tested without applying each patch/hotfix, etc...
The best i got was dissabling some options on Tools, like the one that underlines bad code, etc... so why on the hell that option has any influence... i am not typing anything on the IDE (on the tests)... and if i have it dissabled the memory leak gets reduced a lot...
Of course, if i use the IDE (write code on an opened project) without even compiling / running it... the thing goes much more worst... memory leak upto 6GB can got reached on less than an hour, sometimes occurs after 15 minutes of Copy/Paste source code.
Seems there will not be a solution in a short time!!!
So i got the next solution that works perfect:
-Close the IDE an reopen it each 15 minutes or less
Ugly solution, i know... but works!!!

Delphi: FastMM virtual memory management reference?

I had an issue recently (see my last question) that led me to take a closer look at the memory management in my Delphi application. After my first exploration, I have two questions.
I've started playing with the FastMMUsageTracker, and noticed the following. When I open a file to be used by the app (which also creates a form etc...), there is a significant discrepancy between the variation in available virtual memory for the app, and the variation in "FastMM4 allocated" memory.
First off, I'm a little confused by the terminology: why is there some FastMM-allocated memory and some "System-allocated" (and reserved) memory? Since FastMM is the memory manager, why is the system in charge of allocating some of the memory?
Also, how can I get more details on what objects/structures have been allocated that memory? The VM chart is only useful in showing the amount of memory that is "system allocated", "system reserved", or "FastMM allocated", but there is no link to the actual objects requiring that memory. Is it possible for example to get a report, mid-execution, similar to what FastMM generates upon closing the application? FastMM obviously stores that information somewhere.
As a bonus for me, if people can recommend a good reference (book, website) on the subject, it would also be much appreciated. There are tons of info on the net, but it's usually very case-specific and experts-oriented.
Thanks!
PS: This is not about finding leaks, no problem there, just trying to understand memory management better and be pre-emptive for the future, as our application uses more and more memory.
Some of your questions are easy. Well, one of them anyway!
Why is there some FastMM-allocated
memory and some "System-allocated"
(and reserved) memory? Since FastMM is
the memory manager, why is the system
in charge of allocating some of the
memory?
The code that you write in Delphi is only part of what runs in your process. You use 3rd party libraries in the form of DLLs, most notably the Windows API. Anytime you create a Delphi form, for example, there are a lot of windows objects behind it that consume memory. This memory does not get allocated by FastMM and I presume is what is termed "system-allocated" in your question.
However, if you want to go any deeper then this very rapidly becomes an extremely complex topic. If you do want to go deeper into the implementation of Windows memory management then I think you need to consult a serious reference source. I suggest Windows Internals by Mark Russinovich, David Solomon and Alex Ionescu.
First off, I'm a little confused by the terminology: why is there some FastMM-allocated memory and some "System-allocated" (and reserved) memory? Since FastMM is the memory manager, why is the system in charge of allocating some of the memory?
Where do you suppose FastMM gets the memory to allocate? It comes from the system, of course.
When your app starts up, FastMM gets a block of memory from the system. When you ask for some memory to use (whether with GetMem, New, or TSomething.Create), FastMM tries to give it to you from that first initial block. If there's not enough there, FastMM asks for more (in one block if possible) from the system, and returns a chunk of that to you. When you free something, FastMM doesn't return that memory to the OS, because it figures you'll use it again. It just marks it as unused internally. It also tries to realign unused blocks so that they're as contiguous as possible, in order to try not to have to go back to the OS for more needlessly. (This realignment isn't always possible, though; that's where you end up with memory fragmentation from things like multiple resizing of dynamic arrays, lots of object creates and frees, and so forth.)
In addition to the memory FastMM manages in your app, the system sets aside room for the stack and heap. Each process gets a meg of stack space when it starts up, as room to put variables. This stack (and the heap) can grow dynamically as needed.
When your application exits, all of the memory it's allocated is released back to the OS. (It may not appear so immediately in Task Manager, but it is.)
Is it possible for example to get a report, mid-execution, similar to what FastMM generates upon closing the application?
Not as far as I can tell. Because FastMM stores it somewhere doesn't necessarily mean there's a way to access it during runtime from outside the memory manager. You can look at the source for FastMMUsageTracker to see how the information is retrieved (using GetMemoryManagerState and GetMemoryMap, in the RefreshSnapshot method). The source to FastMM4 is also available; you can look and see what public methods are available.
FastMM's own documentation (in the form of the readme files, FastMM4Options.inc comments, and the FastMM4_FAQ.txt file) is useful to some extent in explaining how it works and what debugging options (and information) is available.
For a detailed map of what memory a process is using, try VMMAP from www.sysinternals.com (also co-authored by Mark Russinovich, mentioned in David's answer). This also allows you to see what is stored in some of the locations (type control-T when a detail line is selected).
Warning: there is much more memory in use by your process than you might think. You may need to read the book first.

How to get the root cause of a memory corruption in a embedded environment?

I have detected a memory corruption in my embedded environment (my program is running on a set top box with a proprietary OS ). but I couldn't get the root cause of it.
the memory corruption , itself, is detected after a stress test of launching and exiting an application multiple times. giving that I couldn't set a memory break point because the corruptued variable is changing it's address every time that the application is launched, is there any idea to catch the root cause of this corruption?
(A memory break point is break point launched when the environment change the value of a giving memory address)
note also that all my software is developed using C language.
Thanks for your help.
These are always difficult problems on embedded systems and there is no easy answer. Some tips:
Look at the value the memory gets corrupted with. This can give a clear hint.
Look at datastructures next to your memory corruption.
See if there is a pattern in the memory corruption. Is it always at a similar address?
See if you can set up the memory breakpoint at run-time.
Does the embedded system allow memory areas to be sandboxed? Set-up sandboxes to safeguard your data memory.
Good luck!
Where is the data stored and how is it accessed by the two processes involved?
If the structure was allocated off the heap, try allocating a much larger block and putting large guard areas before and after the structure. This should give you an idea of whether it is one of the surrounding heap allocations which has overrun into the same allocation as your structure. If you find that the memory surrounding your structure is untouched, and only the structure itself is corrupted then this indicates that the corruption is being caused by something which has some knowledge of your structure's location rather than a random memory stomp.
If the structure is in a data section, check your linker map output to determine what other data exists in the vicinity of your structure. Check whether those have also been corrupted, introduce guard areas, and check whether the problem follows the structure if you force it to move to a different location. Again this indicates whether the corruption is caused by something with knowledge of your structure's location.
You can also test this by switching data from the heap into a data section or visa versa.
If you find that the structure is no longer corrupted after moving it elsewhere or introducing guard areas, you should check the linker map or track the heap to determine what other data is in the vicinity, and check accesses to those areas for buffer overflows.
You may find, though, that the problem does follow the structure wherever it is located. If this is the case then audit all of the code surrounding references to the structure. Check the contents before and after every access.
To check whether the corruption is being caused by another process or interrupt handler, add hooks to each task switch and before and after each ISR is called. The hook should check whether the contents have been corrupted. If they have, you will be able to identify which process or ISR was responsible.
If the structure is ever read onto a local process stack, try increasing the process stack and check that no array overruns etc have occurred. Even if not read onto the stack, it's likely that you will have a pointer to it on the stack at some point. Check all sub-functions called in the vicinity for stack issues or similar that could result in the pointer being used erroneously by unrelated blocks of code.
Also consider whether the compiler or RTOS may be at fault. Try turning off compiler optimisation, and failing that inspect the code generated. Similarly consider whether it could be due to a faulty context switch in your proprietary RTOS.
Finally, if you are sharing the memory with another hardware device or CPU and you have data cache enabled, make sure you take care of this through using uncached accesses or similar strategies.
Yes these problems can be tough to track down with a debugger.
A few ideas:
Do regular code reviews (not fast at tracking down a specific bug, but valuable for catching such problems in general)
Comment-out or #if 0 out sections of code, then run the cut-down application. Try commenting-out different sections to try to narrow down in which section of the code the bug occurs.
If your architecture allows you to easily disable certain processes/tasks from running, by the process of elimination perhaps you can narrow down which process is causing the bug.
If your OS is a cooperative multitasking e.g. round robin (this would be too hard I think for preemptive multitasking): Add code to the end of the task that "owns" the structure, to save a "check" of the structure. That check could be a memcpy (if you have the time and space), or a CRC. Then after every other task runs, add some code to verify the structure compared to the saved check. This will detect any changes.
I'm assuming by your question you mean that you suspect some part of the proprietary code is causing the problem.
I have dealt with a similar issue in the past using what a colleague so tastefully calls a "suicide note". I would allocate a buffer capable of storing a number of copies of the structure that is being corrupted. I would use this buffer like a circular list, storing a copy of the current state of the structure at regular intervals. If corruption was detected, the "suicide note" would be dumped to a file or to serial output. This would give me a good picture of what was changed and how, and by increasing the logging frequency I was able to narrow down the corrupting action.
Depending on your OS, you may be able to react to detected corruption by looking at all running processes and seeing which ones are currently holding a semaphore (you are using some kind of access control mechanism with shared memory, right?). By taking snapshots of this data too, you perhaps can log the culprit grabbing the lock before corrupting your data. Along the same lines, try holding the lock to the shared memory region for an absurd length of time and see if the offending program complains. Sometimes they will give an error message that has important information that can help your investigation (for example, line numbers, function names, or code offsets for the offending program).
If you feel up to doing a little linker kung fu, you can most likely specify the address of any statically-allocated data with respect to the program's starting address. This might give you a consistent-enough memory address to set a memory breakpoint.
Unfortunately, this sort of problem is not easy to debug, especially if you don't have the source for one or more of the programs involved. If you can get enough information to understand just how your data is being corrupted, you may be able to adjust your structure to anticipate and expect the corruption (sometimes needed when working with code that doesn't fully comply with a specification or a standard).
You detect memory corruption. Could you be more specific how? Is it a crash with a core dump, for example?
Normally the OS will completely free all resources and handles your program has when the program exits, gracefully or otherwise. Even proprietary OSes manage to get this right, although its not a given.
So an intermittent problem could seem to be triggered after stress but just be chance, or could be in the initialisation of drivers or other processes the program communicates with, or could be bad error handling around say memory allocations that fail when the OS itself is under stress e.g. lazy tidying up of the closed programs.
Printfs in custom malloc/realloc/free proxy functions, or even an Electric Fence -style custom allocator might help if its as simple as a buffer overflow.
Use memory-allocation debugging tools like ElectricFence, dmalloc, etc - at minimum they can catch simple errors and most moderately-complex ones (overruns, underruns, even in some cases write (or read) after free), etc. My personal favorite is dmalloc.
A proprietary OS might limit your options a bit. One thing you might be able to do is run the problem code on a desktop machine (assuming you can stub out the hardware-specific code), and use the more-sophisticated tools available there (i.e. guardmalloc, electric fence).
The C library that you're using may include some routines for detecting heap corruption (glibc does, for instance). Turn those on, along with whatever tracing facilities you have, so you can see what was happening when the heap was corrupted.
First I am assuming you are on a baremetal chip that isn't running Linux or some other POSIX-capable OS (if you are there are much better techniques such as Valgrind and ASan).
Here's a couple tips for tracking down embedded memory corruption:
Use JTAG or similar to set a memory watchpoint on the area of memory that is being corrupted, you might be able to catch the moment when memory being is accidentally being written there vs a correct write, many JTAG debuggers include plugins for IDEs that allow you to get stack traces as well
In your hard fault handler try to generate a call stack that you can print so you can get a rough idea of where the code is crashing, note that since memory corruption can occur some time before the crash actually occurs the stack traces you get are unlikely to be helpful now but with better techniques mentioned below the stack traces will help, generating a backtrace on baremetal can be a very difficult task though, if you so happen to be using a Cortex-M line processor check this out https://github.com/armink/CmBacktrace or try searching the web for advice on generating a back/stack trace for your particular chip
If your compiler supports it use stack canaries to detect and immediately crash if something writes over the stack, for details search the web for "Stack Protector" for GCC or Clang
If you are running on a chip that has an MPU such as an ARM Cortex-M3 then you can use the MPU to write-protect the region of memory that is being corrupted or a small region of memory right before the region being corrupted, this will cause the chip to crash at the moment of the corruption rather than much later

How to log mallocs

This is a bit hypothetical and grossly simplified but...
Assume a program that will be calling functions written by third parties. These parties can be assumed to be non-hostile but can't be assumed to be "competent". Each function will take some arguments, have side effects and return a value. They have no state while they are not running.
The objective is to ensure they can't cause memory leaks by logging all mallocs (and the like) and then freeing everything after the function exits.
Is this possible? Is this practical?
p.s. The important part to me is ensuring that no allocations persist so ways to remove memory leaks without doing that are not useful to me.
You don't specify the operating system or environment, this answer assumes Linux, glibc, and C.
You can set __malloc_hook, __free_hook, and __realloc_hook to point to functions which will be called from malloc(), realloc(), and free() respectively. There is a __malloc_hook manpage showing the prototypes. You can add track allocations in these hooks, then return to let glibc handle the memory allocation/deallocation.
It sounds like you want to free any live allocations when the third-party function returns. There are ways to have gcc automatically insert calls at every function entrance and exit using -finstrument-functions, but I think that would be inelegant for what you are trying to do. Can you have your own code call a function in your memory-tracking library after calling one of these third-party functions? You could then check if there are any allocations which the third-party function did not already free.
First, you have to provide the entrypoints for malloc() and free() and friends. Because this code is compiled already (right?) you can't depend on #define to redirect.
Then you can implement these in the obvious way and log that they came from a certain module by linking those routines to those modules.
The fastest way involves no logging at all. If the amount of memory they use is bounded, why not pre-allocate all the "heap" they'll ever need and write an allocator out of that? Then when it's done, free the entire "heap" and you're done! You could extend this idea to multiple heaps if it's more complex that that.
If you really do need to "log" and not make your own allocator, here's some ideas. One, use a hash table with pointers and internal chaining. Another would be to allocate extra space in front of every block and put your own structure there containing, say, an index into your "log table," then keep a free-list of log table entries (as a stack so getting a free one or putting a free one back is O(1)). This takes more memory but should be fast.
Is it practical? I think it is, so long as the speed-hit is acceptable.
You could run the third party functions in a separate process and close the process when you are done using the library.
A better solution than attempting to log mallocs might be to sandbox the functions when you call them—give them access to a fixed segment of memory and then free that segment when the function is done running.
Unconfined, incompetent memory usage can be just as damaging as malicious code.
Can't you just force them to allocate all their memory on the stack? This way it would be garanteed to be freed after the function exits.
In the past I wrote a software library in C that had a memory management subsystem that contained the ability to log allocations and frees, and to manually match each allocation and free. This was of some use when attempting to find memory leaks, but it was difficult and time consuming to use. The number of logs was overwhelming, and it took an extensive amount of time to understand the logs.
That being said, if your third party library has extensive allocations, its more then likely impractical to track this via logging. If you're running in a Windows environment, I would suggest using a tool such as Purify[1] or BoundsChecker[2] that should be able to detect leaks in your third party libraries. The investment in the tool should pay for itself in time saved.
[1]: http://www-01.ibm.com/software/awdtools/purify/ Purify
[2]: http://www.compuware.com/products/devpartner/visualc.htm BoundsChecker
Since you're worried about memory leaks and talking about malloc/free, I assume you're in C. I'm also assuming based on your question that you do not have access to the source code of the third party library.
The only thing I can think of is to examine memory consumption of your app before & after the call, log error messages if they're different and convince the third party vendor to fix any leaks you find.
If you have money to spare, then consider using Purify to track issues. It works wonders, and does not require source code or recompilation. There are also other debugging malloc libraries available that are cheaper. Electric Fence is one name I recall. That said, the debugging hooks mentioned by Denton Gentry seem interesting too.
If you're too poor for Purify, try Valgrind. It it a lot better than it was 6 years ago and a lot easier to dive into than Purify.
Microsoft Windows provides (use SUA if you need a POSIX), quite possibly, the most advanced heap+(other api known to use the heap) infrastructure of any shipping OS today.
the __malloc() debug hooks and the associated CRT debug interfaces are nice for cases where you have the source code to the tests, however they can often miss allocations by standard libraries or other code which is linked. This is expected as they are the Visual Studio heap debugging infrastructure.
gflags is a very comprehensive and detailed set of debuging capabilities which has been included with Windows for many years. Having advanced functionality for source and binary only use cases (as it is the OS heap debugging infrastructure).
It can log full stack traces (repaginating symbolic information in a post-process operation), of all heap users, for all heap modifying entrypoint's, serially if needed. Also, it may modify the heap with pathalogical cases which may align the allocation of data such that the page protection offered by the VM system is optimally assigned (i.e. allocate your requested heap block at the end of a page, so even a singele byte overflow is detected at the time of the overflow.
umdh is a tool which can help assess the status at various checkpoints, however the data is continually accumulated during the execution of the target o it is not a simple checkpointing debug stop in the traditional context. Also, WARNING, Last I checked at least, the total size of the circular buffer which store's the stack information, for each request is somewhat small (64k entries (entries+stack)), so you may need to dump rapidly for heavy heap users. There are other ways to access this data but umdh is fairly simple.
NOTE there are 2 modes;
MODE 1, umdh {-p:Process-id|-pn:ProcessName} [-f:Filename] [-g]
MODE 2, umdh [-d] {File1} [File2] [-f:Filename]
I do not know what insanity gripped the developer who chose to alternate between -p:foo argument specifier's and naked ordering of argument's but it can get a little confusing.
The debugging sdk works with a number of other tools, memsnap is a tool which apparently focuses on memory leask and such, but I have not used it, your milage may vary.
Execute gflags with no arguments for the UI mode, +arg's and /args are different "modes" of use also.
On Linux I've successfully used mtrace(3) to log allocations and freeings. Its usage is as simple as
Modify your program to call mtrace() when you need to begin tracing (e.g. at the top of main()),
Set environment variable MALLOC_TRACE to the file path where the trace should be saved and run the program.
After that the output file will contain something like this (excerpt from the middle to show a failed allocation):
# /usr/lib/tls/libnvidia-tls.so.390.116:[0xf44b795c] + 0x99e5e20 0x49
# /opt/gcc-7/lib/libstdc++.so.6:(_ZdlPv+0x18)[0xf6a80f78] - 0x99beba0
# /usr/lib/tls/libnvidia-tls.so.390.116:[0xf44b795c] + 0x9a23ec0 0x10
# /opt/gcc-7/lib/libstdc++.so.6:(_ZdlPv+0x18)[0xf6a80f78] - 0x9a23ec0
# /opt/Xorg/lib/video-libs/libGL.so.1:[0xf668ee49] + 0x99c67c0 0x8
# /opt/Xorg/lib/video-libs/libGL.so.1:[0xf668f14f] - 0x99c67c0
# /opt/Xorg/lib/video-libs/libGL.so.1:[0xf668ee49] + (nil) 0x30000000
# /lib/libc.so.6:[0xf677f8eb] + 0x99c21f0 0x158
# /lib/libc.so.6:(_IO_file_doallocate+0x91)[0xf677ee61] + 0xbfb00480 0x400
# /lib/libc.so.6:(_IO_setb+0x59)[0xf678d7f9] - 0xbfb00480

Resources