Is deallocation of multiple large bunches of memory worth it?

Say, for instance, I write a program which allocates a bunch of large objects when it is initialized. Then the program runs for a while, perhaps indefinitely, and when it's time to terminate, each of the large initialized objects is freed.
So my question is: will it take longer to manually deallocate each block of memory separately at the end of the program's life, or would it be better to let the system unload the program and deallocate all of the virtual memory given to the program at the same time?
Would it be safe and/or faster? Also, if it is safe, does the compiler do this when set to optimise anyway?

1) Not all systems will free memory for you when the application terminates. Most modern desktop systems do, so if your program will only ever run on Linux, Mac, or Windows, you can leave the deallocation to the system.
2) Often you need to do something with the data on termination, not just free the memory. If your design makes it hard to deallocate objects manually at the end, then later, when you need to run some code before exiting, you will face a hard problem.
2') Even if you think your program will need some objects right up until it dies, you may later want to turn the program into a library, or change the project to load and unload your big objects, and a poor design will make this hard or impossible.
3) Moreover, the cost of deallocating in the program depends on the allocator implementation you use. The cost of deallocation by the system depends on the system's memory management, and even a single system can have several implementations. So if you run into allocation/deallocation performance problems, you are better off developing a better allocator than relying on the system.
4) So my opinion is: when you deallocate memory manually at the end, you are always on the right track. When you don't, you may get some doubtful benefit in a few cases, but more likely you will just run into problems sooner or later.

Well, most OSes will free the memory when the program exits, but the bigger question is why would you want it to have to?
Is it faster? Hard to say with memory sometimes. I would guess not really and definitely not worth breaking good coding practices anyway.
Is it safe? Define safe... Will your OS crash? Probably not. Will your code be susceptible to memory leaks or other problems? Absolutely, it will. In fact you are basically telling it you want memory leaks.
Best practice is to always free your memory when you are done with it. With C and C++, every malloced or new block of memory should have a corresponding free or delete.
It is a bad idea to rely on the OS to free your memory because it not only makes your code look bad and makes it less portable, but if the program was ever integrated into another program, then you will likely be tracking down memory leaks for hours.
So, short answer, always do it manually.
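As a minimal sketch of the pairing rule above, assuming a hypothetical App struct that owns a few large buffers (the names App, app_init, and app_shutdown, and the buffer count and size, are illustrative and not from the question), initialization and shutdown can mirror each other so that every malloc has exactly one matching free:

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_BUFFERS 4                 /* illustrative count */
#define BUFFER_BYTES (1024 * 1024)    /* illustrative size: 1 MiB each */

/* Hypothetical owner of the program's large, long-lived objects. */
typedef struct {
    void *buffers[NUM_BUFFERS];
    int live;                         /* how many buffers are currently allocated */
} App;

/* Free every buffer that was successfully allocated; safe to call twice. */
void app_shutdown(App *app) {
    assert(app != NULL);
    for (int i = 0; i < app->live; i++) {
        free(app->buffers[i]);
        app->buffers[i] = NULL;
    }
    app->live = 0;
}

/* Allocate all buffers; on failure, undo the partial work and return -1. */
int app_init(App *app) {
    assert(app != NULL);
    app->live = 0;
    for (int i = 0; i < NUM_BUFFERS; i++) {
        app->buffers[i] = malloc(BUFFER_BYTES);
        if (app->buffers[i] == NULL) {
            app_shutdown(app);
            return -1;
        }
        app->live++;
    }
    return 0;
}
```

Because all ownership sits in one struct, the same pair of calls keeps working if the code is later moved into a library or needs to reload its objects mid-run.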

Programs with a short maintenance lifetime are good candidates for memory deallocation by "exit() and let the kernel sort them out." However, if the program will last more than a few months you have to consider the maintenance burden.
For instance, consider that someone may realize that a subsequent stage is required in the program, and some of the data is not needed, or not needed in memory. They now have to go and find out how to deallocate the memory, properly removing stale references, etc.


Use of Memory Profiling

What is the meaning of memory profiling?
Does it give memory statistics, such as how much memory is utilized?
And are there different kinds of it?
The problem is, you may be doing way too much new-ing which, even in a language with a garbage collector, may unnecessarily dominate your execution time.
You may also have a memory leak meaning that the amount of dynamic memory you're not returning to the pool grows steadily over time.
If your app runs for a long time, that's equally bad.
I use the random-pausing method for performance diagnosis, but that is of no value for finding memory leaks.
That's what Memory Profiling should help with.
Here's how I've found memory leaks in the past, using MFC.
In a debug build, when I shut down the app, it prints a list of all the non-collected memory blocks, along with their class type.
Then I look to see where those blocks are created, and try to figure out why they weren't deleted or collected.
It would be more useful if I could capture a stack trace on each block, so I could tell which new statement made it, and the stack could tell me why.
The point is, I could allocate 100 blocks of class Foo, and delete 99 of them.
The one that I don't delete is the problem, so it would be useful to know more about where it came from.
I don't know if memory profilers can do this or not.
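The bookkeeping described above can be approximated in plain C. This is only a sketch, not how MFC or any particular memory profiler works: it records the call site (file and line) of each allocation as a cheaper stand-in for a full stack trace, and lists whatever is still live on request. The names tracked_malloc, tracked_free, report_leaks, and TRACKED_MALLOC are hypothetical, as is the fixed table size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 1024   /* illustrative cap on simultaneously tracked blocks */

typedef struct {
    void *ptr;          /* NULL marks an unused slot */
    size_t size;
    const char *file;   /* call site that allocated the block */
    int line;
} LiveBlock;

static LiveBlock live_blocks[MAX_LIVE];

/* Allocate a block and remember where the request came from. */
void *tracked_malloc(size_t size, const char *file, int line) {
    void *p = malloc(size);
    if (p == NULL) return NULL;
    for (int i = 0; i < MAX_LIVE; i++) {
        if (live_blocks[i].ptr == NULL) {
            live_blocks[i].ptr = p;
            live_blocks[i].size = size;
            live_blocks[i].file = file;
            live_blocks[i].line = line;
            return p;
        }
    }
    return p;   /* table full: the block is usable but not tracked */
}

/* Captures the caller's file and line automatically. */
#define TRACKED_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)

/* Free a block and forget its record. */
void tracked_free(void *p) {
    if (p == NULL) return;
    for (int i = 0; i < MAX_LIVE; i++) {
        if (live_blocks[i].ptr == p) {
            live_blocks[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Print every block that is still live and return how many there are. */
size_t report_leaks(FILE *out) {
    assert(out != NULL);
    size_t count = 0;
    for (int i = 0; i < MAX_LIVE; i++) {
        if (live_blocks[i].ptr != NULL) {
            fprintf(out, "leak: %zu bytes allocated at %s:%d\n",
                    live_blocks[i].size, live_blocks[i].file, live_blocks[i].line);
            count++;
        }
    }
    return count;
}
```

In the "100 blocks of Foo, 99 deleted" case, calling report_leaks(stderr) at shutdown would print a single line naming the allocation site of the surviving block.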

How to do lua_pushstring while avoiding an out of memory setjmp exception

Sometimes I want to use lua_pushstring after I have allocated some resources which I would need to clean up in case of failure. However, as the documentation seems to imply, lua_push* functions can always end up with an out of memory exception. That exception instantly exits my C scope and doesn't allow me to clean up whatever I might have temporarily allocated and would have to free in case of error.
Example code to illustrate the situation:
void *blubb = malloc(20);
/* ...some other things happening here... */
lua_pushstring(L, "test"); /* how to do this call safely so I can still take care of blubb? */
/* ...possibly more things going on here... */
Is there a way I can check beforehand if such an exception would happen and then avoid pushing and doing my own error triggering as soon as I safely cleaned up my own resources? Or can I somehow simply deactivate the setjmp, and then check some "magic variable" after doing the push to see if it actually worked or triggered an error?
I considered pcall'ing my own function, but even just pushing the function on the stack I want to call safely through pcall can possibly give me an out of memory, can't it?
To clear things up, I am specifically asking this for combined use with custom memory allocators that will prevent Lua from allocating too much memory, so assume this is not a case where the whole system has run out of memory.
Unless you have registered a user-defined memory handler with Lua when you created your Lua state, getting an out of memory error means that your entire application has run out of memory. Recovery from this state is generally not possible. Or at least, not feasible in a lot of cases. It could be depending on your application, but probably not.
In short, if it ever comes up, you've got bigger things to be concerned about ;)
The only kind of cleanup that should affect you is for things external to your application: for example, if you have some process-global memory that you need to free or set some state in, or you're doing interprocess communication and have a memory-mapped file you're talking through. Or something like that.
Otherwise, it's probably better to just kill your process.
You could build Lua as a C++ library. When you do that, errors become actual exceptions, which you can either catch or just use RAII objects to handle.
If you're stuck with C... well, there's not much you can do.
I am specifically interested in a custom allocator that will out of memory much earlier to avoid Lua eating too much memory.
Then you should handle it another way. To signal an out-of-memory error is basically to say, "I want Lua to terminate right now."
The way to stop Lua from eating memory is to periodically check the Lua state's memory, and garbage collect it if it's using too much. And if that doesn't free up enough memory, then you should terminate the Lua state manually, but only when it is safe to do so.
lua_atpanic() may be one solution for you, depending on the kind of cleanup you need to do. It will never throw an error.
In your specific example you could also create blubb as a userdata. Then Lua's garbage collector would free it automatically once nothing references it any more, for example after it has left the stack.
I have recently gotten into some more Lua sandboxing again, and now I think the answer I accepted previously is a bad idea. I have given this some more thought:
Why periodic checking is not enough
Periodically checking for large memory consumption and terminating Lua "only when it is safe to do so" seems like a bad idea. A single huge table can eat up a lot of your memory with one single VM instruction, and you'll only find out after it has happened. By then your program might already be dying from it, and you indeed have much bigger problems, which you could have avoided entirely by stopping that allocation in time in the first place.
Since Lua has a nice out of memory exception already built-in, I would just like to use that one since this allows me to do the minimal required thing (preventing the script from allocating more stuff, while possibly allowing it to recover) without my C code breaking from it.
Therefore my current plan for Lua sandboxing with memory limit is:
Use custom allocator that returns NULL with limit
Design all C functions to be able to handle this without memory leak or other breakage
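The first item of that plan could look roughly like the following. This is a sketch under assumptions, not the poster's actual allocator: the function has the four-argument shape that the Lua manual documents for lua_Alloc, but it is written without including lua.h, and MemBudget and limited_alloc are hypothetical names. With Lua 5.1 through 5.4 it would presumably be installed with lua_newstate(limited_alloc, &budget).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-state accounting, passed to the allocator as user data. */
typedef struct {
    size_t used;    /* bytes currently handed out */
    size_t limit;   /* hard cap in bytes */
} MemBudget;

/* Allocator in the shape of lua_Alloc: returns NULL once the cap would be exceeded. */
void *limited_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    MemBudget *budget = ud;
    assert(budget != NULL);

    /* For a fresh allocation, osize is not a byte count, so treat it as 0. */
    if (ptr == NULL) osize = 0;
    assert(budget->used >= osize);

    if (nsize == 0) {               /* a request to free the block */
        free(ptr);
        budget->used -= osize;
        return NULL;
    }

    /* Refuse growth beyond the remaining room; Lua turns NULL into a memory error. */
    size_t room = budget->limit > budget->used ? budget->limit - budget->used : 0;
    if (nsize > osize && nsize - osize > room) {
        return NULL;
    }

    void *fresh = realloc(ptr, nsize);
    if (fresh != NULL) {
        budget->used = budget->used - osize + nsize;
    }
    return fresh;
}
```

Returning NULL leaves the old block untouched, so the script can still free memory and recover after hitting the limit.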
But how to design the C functions safely?
How to do that, since lua_pushstring and others can always setjmp away with an error without me knowing whether that is gonna happen in advance? (this was originally my question)
I think I found a working approach:
I added a facility to register pointers when I allocate them, and where I unregister them after I am done with them. This means if Lua suddenly setjmp's me out of my C code without me getting a chance to clean up, I have everything in a global list I need to clean up that mess later when I'm back in control.
Is that ugly or what?
Yes, it is quite the hack. But it will most likely work, and unlike 'periodic checking' it will actually allow me to have a true hard limit and avoid getting the application itself into trouble because of an aggressive attack.
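A minimal sketch of such a pointer registry, in plain C without Lua. All names here (guard_malloc, guard_free, guard_cleanup_all, call_protected, failing_c_function) are hypothetical, and the longjmp merely stands in for lua_pushstring raising an out of memory error; in a real program the cleanup would presumably run after lua_pcall reports a failure.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

#define MAX_PENDING 64   /* illustrative cap on registered pointers */

static void *pending[MAX_PENDING];
static size_t pending_count = 0;
static jmp_buf recover_point;

/* Allocate and register the pointer so it can be found after a non-local exit. */
void *guard_malloc(size_t size) {
    if (pending_count == MAX_PENDING) return NULL;
    void *p = malloc(size);
    if (p != NULL) pending[pending_count++] = p;
    return p;
}

/* Normal path: unregister the pointer and free it. */
void guard_free(void *p) {
    if (p == NULL) return;
    for (size_t i = 0; i < pending_count; i++) {
        if (pending[i] == p) {
            pending[i] = pending[--pending_count];
            free(p);
            return;
        }
    }
    assert(0 && "guard_free: pointer was never registered");
}

/* Error path: free everything still registered; returns how many were freed. */
size_t guard_cleanup_all(void) {
    size_t freed = pending_count;
    while (pending_count > 0) {
        free(pending[--pending_count]);
    }
    return freed;
}

/* A C function that gets jumped out of before it can clean up. */
void failing_c_function(void) {
    void *blubb = guard_malloc(20);
    (void)blubb;
    longjmp(recover_point, 1);   /* stands in for a Lua memory error */
}

/* Run fn; if it jumps out, sweep the registry once control is regained. */
size_t call_protected(void (*fn)(void)) {
    if (setjmp(recover_point) == 0) {
        fn();
        return 0;
    }
    return guard_cleanup_all();
}
```

A single global list like this is only safe if C functions do not keep registered pointers alive across calls; anything meant to outlive the call has to be unregistered first.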

When to call SetProcessWorkingSetSize? (Convincing the memory manager to release the memory)

In a previous post ( My program never releases the memory back. Why? ) I show that FastMM can cache (read as hold for itself) pretty large amounts of memory. If your application just loaded a large data set in RAM, after releasing the data, you will see that impressive amounts of RAM are not released back to the memory pool.
I looked around and it seems that calling the SetProcessWorkingSetSize API function will "flush" the cache to disk. However, I cannot decide when to call this function. I wanted to call it at the end of the OnClick event on the button that is performing the RAM intensive operation. However, some people are saying that this may cause AV.
If anybody used this function successfully, please let me (us) know.
Many thanks.
1. After releasing the data set, the program still takes large amounts of RAM. After calling SetProcessWorkingSetSize the size returns to a few MB. Some argue that nothing was released back. I agree. But the memory footprint is now small AND it does NOT increase again when using the program normally (for example, when performing normal operations that do not involve loading large data sets). Unfortunately, there is no way to demonstrate whether the memory swapped to disk is ever loaded back into memory, but I think it is not.
2. I have already demonstrated (I hope) this is not a memory leak:
My program never releases the memory back. Why?
How to convince the memory manager to release unused memory
If SetProcessWorkingSetSize would solve your problem, then your problem is not that FastMM is keeping hold of memory, since this function just trims the working set of your application by writing the memory in RAM to the page file. Nothing is released back to Windows.
In fact, you have only made accessing the memory slower, since it now has to be read back from disk. This method has the same effect as minimising your application: Windows presumes you are not going to use the application again soon and writes the working set in RAM to the pagefile. Windows does a good job of deciding when to write RAM to the pagefile and tries to keep the most used memory in RAM as long as it can. It will make the working set smaller (write to the pagefile) when there is little RAM left. I would not mess with it just to give the illusion that your program is using less memory while in fact it is using just as much as before, only now it is slower to access. Memory that is accessed again will be loaded into RAM again and make the working set grow again. Touching less memory keeps the working set smaller.
So no, this will not help you force FastMM to release the memory. If your goal is for your application to use less memory, you should look elsewhere. Look for leaks, look for heap fragmentation, look for optimisations, and if you think FastMM is keeping you from doing so, you should try to find facts to support it. If your goal is to keep your working set small, you could try to keep your memory access local. Maybe FastMM or another memory manager could help you with it, but it is a very different problem from using too much memory. And maybe this function does help you solve the problem you are having, but I would use it with care and certainly not use it just to keep up the illusion that your program has a low memory usage.
I agree with Lars Truijens 100%. If you don't, then you can check the FastMM memory usage via the FastMM calls GetMemoryManagerState and GetMemoryManagerUsageSummary before and after calling the API SetProcessWorkingSetSize.
Are you sure there is a problem? Working sets might only decrease when there really is a memory shortage.
Problem solved:
I don't need to use SetProcessWorkingSetSize. FastMM will eventually release the RAM.
To confirm that this behavior is generated by FastMM (as suggested by Barry Kelly) I created a second program that allocated A LOT of RAM. As soon as Windows ran out of RAM, my program's memory utilization returned to its original value.
I used this function just once, when I implemented TWebBrowser. This component took up so much memory, even after I destroyed the instance.

Can a Memcached daemon ever free() unused memory, without terminating the process?

I believe that you can't force a running Memcached instance to de-allocate memory, short of terminating that Memcached instance (and freeing all of the memory it held). Does anyone know of a definitive piece of documentation, or even a mailing list or blog posting from a reliable source, that can confirm or deny this impression?
As I understand it, a Memcached process initially allocates a chunk of memory (the exact initial allocation size is configurable), and then monotonically increases its memory utilization over its lifetime, limited by the daemon's maximum memory allocation size (also configurable). At no point does the Memcached daemon ever free any memory, regardless of whether the daemon has any ongoing need for the memory it holds.
I know that this question might sound a little whiny, with a tone of "I DEMAND that open source project X support my specific need!" That's not it, at all--I'm purely interested in the exact technical answer, here, and I swear I'm not harshing on Memcached. For the curious, this question came out of a discussion about possible methods for gracefully juggling multiple Memcached instances on a single server, given an application where the cost of a cache flush can be quite high.
However, I'd appreciate it if you save your application suggestions/advice for a different question (re-architecting my application, using a different caching implementation, etc.). I do appreciate a good brainstorm, but I think this question will be most valuable if it stays focused on the technical specifics of how Memcached does and does not work. If you don't have the answer to this specific question, there is probably still value in what you have to say, but I'd guess that there's a different, better place to post the more speculative comments/suggestions/advice.
This is probably the hardest problem we have to solve for memcached currently (well, a variation of it, anyway).
Freeing a chunk of memory requires us to know that a) nothing within the chunk is in use and b) nothing will start using it while we're in the process of purging it for reuse/freeing. I've heard some really good ideas for how we might solve our slab rebalancing problems which is basically the same, except we're not trying to free the memory, but to give it to something else (a common problem in a few large installations).
Also, whether free actually reduces the RSS of your process is implementation dependent. In many cases, a malloc/fill/free will leave the memory mapped in (unless your allocator uses mmap instead of sbrk).
I'm pretty sure this isn't possible with memcached. I don't see any technical reason why it couldn't be implemented though. Lock cache operations, expire enough keys to reach the desired size, update the size, unlock. (I'm sure there's nicer ways to avoid blocking the server during that time.)
The standard and default mechanism of memory management in memcached is a slab allocator. It means that memory is allocated for the process and never released to the operating system. Basically, when memory is no longer used to store some data, it is held by the process in order to be reused later, when needed. However, the operating system reclaims the memory allocated by the process when the process finishes. That is why memory is released when you kill/stop memcached.
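To illustrate the "held for reuse, never handed back" behaviour, here is a toy fixed-size pool in C. It is a deliberately simplified sketch and not memcached's actual slab implementation; Slab, slab_alloc, slab_free, slab_destroy, and CHUNKS_PER_PAGE are hypothetical names. Freed chunks go onto a free list and are reused; pages only go back to the system when the whole pool is torn down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNKS_PER_PAGE 64   /* illustrative page granularity */

/* Link used both for free chunks and for page headers. */
typedef struct Node { struct Node *next; } Node;

typedef struct {
    size_t chunk_size;   /* item size rounded up to a multiple of sizeof(Node) */
    Node *free_list;     /* chunks available for reuse */
    Node *page_list;     /* every page ever obtained from malloc */
    size_t pages;
} Slab;

void slab_init(Slab *slab, size_t item_size) {
    assert(slab != NULL && item_size > 0);
    size_t unit = sizeof(Node);
    slab->chunk_size = (item_size + unit - 1) / unit * unit;
    slab->free_list = NULL;
    slab->page_list = NULL;
    slab->pages = 0;
}

/* Hand out a chunk; grab a new page from malloc only when none are free. */
void *slab_alloc(Slab *slab) {
    if (slab->free_list == NULL) {
        /* One extra slot at the front of the page holds the page header. */
        char *page = malloc(slab->chunk_size * (CHUNKS_PER_PAGE + 1));
        if (page == NULL) return NULL;
        Node *header = (Node *)page;
        header->next = slab->page_list;
        slab->page_list = header;
        slab->pages++;
        for (size_t i = 1; i <= CHUNKS_PER_PAGE; i++) {
            Node *chunk = (Node *)(page + i * slab->chunk_size);
            chunk->next = slab->free_list;
            slab->free_list = chunk;
        }
    }
    Node *chunk = slab->free_list;
    slab->free_list = chunk->next;
    return chunk;
}

/* "Freeing" only returns the chunk to the pool, not to the system. */
void slab_free(Slab *slab, void *item) {
    Node *chunk = item;
    chunk->next = slab->free_list;
    slab->free_list = chunk;
}

/* Teardown: the only point at which pages are given back. */
void slab_destroy(Slab *slab) {
    while (slab->page_list != NULL) {
        Node *next = slab->page_list->next;
        free(slab->page_list);
        slab->page_list = next;
    }
    slab->free_list = NULL;
    slab->pages = 0;
}
```

Giving a page back early would require proving that none of its chunks are in use, which is the same difficulty the developer's answer above describes.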
There is a compile-time option in memcached to enable the malloc/free mechanism, so that when free() is called, memory might be released to the operating system (this depends on the C standard library implementation). But doing so might worsen fragmentation and hurt performance.
Please read more about the issue here:
Why not use malloc/free
Memcached memory management

Coping with, and minimizing, memory usage in Common Lisp (SBCL)

I have a VPS with not very much memory (256 MB) which I am trying to use for Common Lisp development with SBCL+Hunchentoot to write some simple web-apps. A large amount of memory appears to be getting used without doing anything particularly complex, and after a while of serving pages it runs out of memory and either goes crazy using all swap or (if there is no swap) just dies.
So I need help to:
Find out what is using all the memory (if it's libraries or me, especially)
Limit the amount of memory which SBCL is allowed to use, to avoid massive quantities of swapping
Handle things cleanly when memory runs out, rather than crashing (since it's a web-app I want it to carry on and try to clean up).
I assume the first two are reasonably straightforward, but is the third even possible?
How do people handle out-of-memory or constrained memory conditions in Lisp?
(Also, I note that a 64-bit SBCL appears to use literally twice as much memory as 32-bit. Is this expected? I can run a 32-bit version if it will save a lot of memory)
To limit the memory usage of SBCL, use the --dynamic-space-size runtime option (e.g., sbcl --dynamic-space-size 128 will limit the dynamic space, i.e. the Lisp heap, to 128 MB).
To find out what is using memory, you may call (room) (the function that tells how much memory is being used) at different times: at startup, after all libraries are loaded, and then during work (of course, call (sb-ext:gc :full t) before room so that you do not measure garbage that has not yet been collected).
Also, it is possible to use SBCL Profiler to measure memory allocation.
Find out what is using all the memory (if it's libraries or me, especially)
Attila Lendvai has some SBCL-specific code to find out where an allocated object comes from. Refer to and write him a private mail if needed.
Be sure to try another implementation, preferably with a precise GC (like Clozure CL) to ensure it's not an implementation-specific leak.
Limit the amount of memory which SBCL is allowed to use, to avoid massive quantities of swapping
Already answered by others.
Handle things cleanly when memory runs out, rather than crashing (since it's a web-app I want it to carry on and try to clean up).
256MB is tight, but anyway: schedule a recurring (maybe 1s) timed thread that checks the remaining free space. If the free space is less than X then use exec() to replace the current SBCL process image with a new one.
If you don't have any type declarations, I would expect 64-bit Lisp to take twice the space of a 32-bit one. Even a plain (small) int will use a 64-bit chunk of memory. I don't think it'll use less than a machine word, unless you declare it.
I can't help with #2 and #3, but if you figure out #1, I suspect it won't be a problem. I've seen SBCL/Hunchentoot instances running for ages. If I'm using an outrageous amount of memory, it's usually my own fault. :-)
I would not be surprised by a 64-bit SBCL using twice the memory, as it will probably use a 64-bit cell rather than a 32-bit one, but couldn't say for sure without actually checking.
Typical things that keep memory hanging around for longer than expected are no-longer-useful references that still have a path to the root allocation set (hash tables are, I find, a good way of letting these things linger). You could try interspersing explicit calls to GC in your code and make sure to (as far as possible) not store things in global variables.
