Finalization Reachable Table - clr

If I implement a destructor in a class, Foo, instances of Foo are tracked closely on the finalization queue. When an instance of Foo is garbage collected, I understand that the CLR sees the entry in the finalization queue and gives that object special treatment by moving the object off the heap and into the finalization reachable table. Then... nothing else happens for that garbage collection cycle?
Will finalize() always be called during the next garbage collection cycle?
Why isn't finalize called immediately after copying my object to the freachable table? (this seems like extra unnecessary complexity)

The finalizer queue is there to simplify things; it would be more complex without it. While the GC runs, no managed code may execute; otherwise the analysis the GC has already done could be invalidated by user code running in the middle of it.
So when the GC runs, finalization must be deferred, instead of getting executed right away. Running it in a separate thread minimizes the time that the VM requires exclusive access to all threads, and increases the potential for concurrent activities.
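The deferral described above can be sketched with a toy model in C. This is only an illustration, not the CLR's actual data structures: `toy_gc`, `gc_track`, `gc_collect`, `gc_run_finalizers` and `count_finalized` are hypothetical names, and the `reachable` flag stands in for a real mark phase. The point is that the collection step only moves entries between two lists, and user-supplied finalizers run later in a separate step.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 16

typedef void (*finalizer_fn)(void *object);

typedef struct {
    void *object;
    finalizer_fn finalize;
    int reachable;              /* stands in for the result of a mark phase */
} tracked_object;

typedef struct {
    tracked_object finalization_queue[MAX_OBJECTS]; /* objects with finalizers */
    size_t tracked_count;
    tracked_object freachable[MAX_OBJECTS];         /* dead, finalizer pending */
    size_t freachable_count;
} toy_gc;

/* Example finalizer (hypothetical): counts how often it was called. */
void count_finalized(void *object) {
    ++*(int *)object;
}

/* Register an object that has a finalizer. Returns 0 on success. */
int gc_track(toy_gc *gc, void *object, finalizer_fn finalize) {
    assert(finalize != NULL);
    if (gc->tracked_count == MAX_OBJECTS) return -1;
    tracked_object entry = { object, finalize, 1 };
    gc->finalization_queue[gc->tracked_count++] = entry;
    return 0;
}

/* "Collection": runs no user code. Unreachable entries are only moved
   to the freachable list; their finalizers are not called here. */
void gc_collect(toy_gc *gc) {
    size_t kept = 0;
    for (size_t i = 0; i < gc->tracked_count; i++) {
        tracked_object entry = gc->finalization_queue[i];
        if (entry.reachable || gc->freachable_count == MAX_OBJECTS) {
            gc->finalization_queue[kept++] = entry;   /* keep tracking it */
        } else {
            gc->freachable[gc->freachable_count++] = entry;
        }
    }
    gc->tracked_count = kept;
}

/* Runs later, outside the collection (in a real runtime, on a dedicated
   finalizer thread). Returns the number of finalizers that were run. */
size_t gc_run_finalizers(toy_gc *gc) {
    size_t ran = 0;
    while (gc->freachable_count > 0) {
        tracked_object entry = gc->freachable[--gc->freachable_count];
        entry.finalize(entry.object);
        ran++;
    }
    return ran;
}
```

In this model, calling `gc_collect` on a `toy_gc` with one unreachable entry leaves the finalizer uncalled and the entry parked in `freachable`; only a later `gc_run_finalizers` call invokes it.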

Related

Does dart reuse memory for previously used instances?

It is hard to find a good heading for this, but I think my problem becomes clear if I post a small code snippet:
SomeObject instance = SomeObject(importantParameter);
// -> "instance" is now a reference to the instance somewhere in RAM
instance = SomeObject(anotherImportantParameter);
// -> "instance" is now a reference to a different instance somewhere in RAM
My question is now, is the used RAM that was allocated at the first construction reused at the second construction? Or is the RAM of the first instance marked as unused for the garbage collector and the second construction is done with a completely new instance with a different portion of RAM?
If the first is true, what about this:
while(true) {
    final SomeObject instance = SomeObject(importantParameter);
}
Will then, each time the while is repeated, the RAM be reused?
It's unspecified. The answer is a resounding "maybe".
The language specification never says what happens to unreachable objects, since it's unobservable to the program. (That's what being unreachable means).
In practice, the native Dart implementation uses a generational garbage collector.
The default behavior would be to allocate a new object in "new-space" and overwrite the reference to the previous object. That makes the previous object unreachable (as long as you haven't stored other references to it), and it can therefore be garbage collected. If you really go through objects quickly, that will be cheap, since the unreachable object is completely ignored on the next new-space garbage collection.
Allocating a lot of short-lived objects still has an overhead since it causes new-space GC to happen more often, even if the individual objects don't themselves cost anything.
There are also a number of optimizations that may change this behavior.
If your object is sufficiently simple and the compiler can see that no reference to it ever escapes, or is used in an identical check, or ... any other number of relevant restrictions, then it might "allocation sink" the object. That means it never actually allocates the object, it just stores the contents somewhere, perhaps even on the stack, and it also inlines the methods so they refer to the data directly instead of going through a this pointer.
In that case, your code may actually reuse the memory of the previous object, because the compiler recognizes that it can.
Do not try to predict whether an optimization like this happens. The requirements can change at any time. Just write code that is correct and not unnecessarily complex, then the compiler will do its best to optimize in all the ways that it can.

Is a block in Objective-C always guaranteed to capture a variable?

Are there any conditions in Objective-C (Objective-C++) where the compiler can detect that a variable capture in a block is never used and thus decide to not capture the variable in the first place?
For example, assume you have an NSArray that contains a large number of items which might take a long time to deallocate. You need to access the NSArray on the main thread, but once you're done with it, you're willing to deallocate it on a background queue. The background block only needs to capture the array and then immediately deallocate. It doesn't actually have to do anything with it. Can the compiler detect this and, "erroneously", skip the block capture altogether?
Example:
// On the main thread...
NSArray *outgoingRecords = self.records;
self.records = incomingRecords;
dispatch_async(background_queue, ^{
    (void)outgoingRecords;
    // After this do-nothing block exits, then outgoingRecords
    // should be deallocated on this background_queue.
});
Am I guaranteed that outgoingRecords will always be captured in that block and that it will always be deallocated on the background_queue?
Edit #1
I'll add a bit more context to better illustrate my issue:
I have an Objective-C++ class that contains a very large std::vector of immutable records. This could easily be 1+ million records. They are basic structs in a vector and accessed on the main thread to populate a table view. On a background thread, a different set of database records might be read into a separate vector, which could also be quite large.
Once the background read has occurred, I jump over to the main thread to swap Objective-C objects and repopulate the table.
At that point, I don't care at all about the contents of the older vector or its parent Objective-C class. There's no fancy destructors or object-graph to teardown, but deallocating hundreds of megabytes, maybe even gigabytes of memory is not instantaneous. So I'm willing to punt it off to a background_queue and have the memory deallocation occur there. In my tests, that appears to work fine and gives me a little bit more time on the main thread to do other stuff before 16ms elapses.
I'm trying to understand if I can get away with simply capturing the object in an "empty" block or if I should do some sort of no-op operation (like call count) so that the compiler cannot optimize it away somehow.
Edit #2
(I originally tried to keep the question as simple as possible, but it seems like it's more nuanced than that. Based on Ken's answer below, I'll add another scenario.)
Here's another scenario that doesn't use dispatch_queues but still uses blocks, which is the part I'm really interested in.
id<MTLCommandBuffer> commandBuffer = ...
// A custom class that manages an MTLTexture that is backed by an IOSurface.
__block MyTextureWrapper *wrapper = ...
// Issue some Metal calls that use the texture inside the wrapper.
// Wait for the buffer to complete, then release the wrapper.
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
    wrapper = nil;
}];
In this scenario, the order of execution is guaranteed by Metal. Unlike the example above, in this scenario performance is not the issue. Rather, the IOSurface that is backing the MTLTexture is being recycled into a CVPixelBufferPool. The IOSurface is being shared between processes and, from what I can tell, MTLTexture does not appear to increase the useCount on the surface. My wrapper class does. When my wrapper class is deallocated, the useCount is decremented and the bufferPool is then free to recycle the IOSurface.
This is all working as expected but I end up with silly code like above just out of uncertainty whether I need to "use" the wrapper instance in the block to ensure it's captured or not. If the wrapper is deallocated before the completion handler runs, then the IOSurface will be recycled and the texture will get overwritten.
Edit to address question edits:
From the Clang Language Specification for Blocks:
Local automatic (stack) variables referenced within the compound
statement of a Block are imported and captured by the Block as const
copies. The capture (binding) is performed at the time of the Block
literal expression evaluation.
The compiler is not required to capture a variable if it can prove
that no references to the variable will actually be evaluated.
Programmers can force a variable to be captured by referencing it in a
statement at the beginning of the Block, like so:
(void) foo;
This matters when capturing the variable has side-effects, as it can
in Objective-C or C++.
(Emphasis added.)
Note that using this technique guarantees that the referenced object lives at least as long as the block, but does not guarantee it will be released with the block, nor by which thread.
There's no guarantee that the block submitted to the background queue will be the last code to hold a strong reference to the array (even ignoring the question of whether the block captures the variable).
First, the block may in fact run before the context which submitted it returns and releases its strong reference. That is, the code which called dispatch_async() could be swapped off the CPU and the block could run first.
But even if the block runs somewhat later than that, a reference to the array may be in an autorelease pool somewhere and not released for some time. Or there may be a strong reference someplace else that will eventually be cleared but not under your explicit control.

Using ARC, is it fatal not to have an autorelease pool for every thread?

I read this:
If you ever create a secondary thread in your application, you need to provide it with its own autorelease pool. Autorelease pools and the objects they contain are discussed further in
the iOS 5 Developer cookbook.
I'm compiling with ARC. I have been creating many background threads, and it seems that I am doing fine. None of my background threads are long-running. Will all those objects ever be released by say, the main thread's autorelease pool? Or what?
This is what I do to call background thread:
+(void)doBackground:(void (^)())block
{
    //DISPATCH_QUEUE_PRIORITY_HIGH
    //dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND,0), ^{
    dispatch_async(dispatch_get_global_queue(-2,0), ^{
        block();
    });
}
Should I change that to
+(void)doBackground:(void (^)())block
{
    //DISPATCH_QUEUE_PRIORITY_HIGH
    //dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND,0), ^{
    dispatch_async(dispatch_get_global_queue(-2,0), ^{
        @autoreleasepool {
            block();
        }
    });
}
Consider it at minimum a programmer error if you do not create an autorelease pool for your new thread. Whether that's fatal to your program is defined by your program's implementation. The classic problem is leaked objects and consequently objects' dealloc which is never executed (could be fatal).
The modern way to create an autorelease pool under ARC is:
void MONThreadsEntry() { // << entry is e.g. a function or method
    @autoreleasepool {
        // ...do your work here...
    }
}
In more detail, autorelease pools behave as thread-local stacks -- you can push and pop, but there should always be one in place before anything on that thread is autoreleased. Autorelease messages are not transferred from one thread to another.
You may not see issues (e.g. in the console or leaks) if your idea of "creating a thread" is using a higher level asynchronous mechanism, such as using an NSOperationQueue, or if the underlying implementation creates a secondary thread and its own autorelease pool.
Anyways, rather than guessing when to create autorelease pools, just learn where you need to create them and when you should create them. It's all well-defined -- there is no need for guesswork, and there is no need to fear creating them.
Similarly, you will never need to create an autorelease pool for your thread if you are using lower level abstractions, and never autorelease objects on that thread. For example, pthreads and pure C implementations won't need to bother with autorelease pools (unless some API you use assumes they are in place).
Even the main thread in a Cocoa app needs an autorelease pool -- it's just typically not something you write because it exists in the project templates.
Update -- Dispatch Queues
In response to updated question: Yes, you should still create autorelease pools for your programs which run under dispatch queues -- note that with a dispatch queue, you're not creating threads so this is quite a different question from the original question. The reason: Although dispatch queues do manage autorelease pools, no guarantee is made regarding the time/point they are emptied. That is to say, your objects would be released (at some point), but you should also create autorelease pools in this context because the implementation could (in theory) drain the pool every 10,000 blocks it runs, or approximately every day. So in this context, it's really only fatal in scenarios such as when you end up consuming too much memory, or when your program expects that its objects will be destroyed in some determined fashion -- for example, you could be loading or processing images in the background and wind up consuming a ton of memory if the life of those images is extended unexpectedly because of the autorelease pools. The other example is shared resources or global objects, where you could respond to notifications or introduce race conditions because your "block local" objects may live a lot longer than you expect. Also remember that the implementation/frequency is free to change as its implementors see fit.
It seems an autorelease pool is now created for new threads automatically. I don't know when this changed or why the documentation states the opposite, but that's how it is.

slow memory release (refcounted structure) - Is my workaround a good way?

in my program i can load a Catalog: ICatalog
a Catalog here contains a lot of refcounted structures (Icollections of IItems, IElements, IRules, etc.)
when I want to change to another catalog,
I load a new Catalog
but the automatic release of the previous ICatalog instance takes time, freezing my application for 2 seconds or more.
my question is :
I want to defer the release of the old (and no more used) ICatalog instance to another thread.
I've not tested it already, but I intend to create a new thread with :
ErazerThread.OldCatalog := Catalog; // old catalog refcount jumps to 2
Catalog := LoadNewCatalog(...); // old catalog refcount =1
ErazerThread.Execute; //just set OldCatalog to nil.
this way, I expect the release to occur in the thread, and my application not
to freeze anymore.
Is it safe (and good practice) ?
Do you have examples of existing code already performing release with a similar method ?
I would let such a thread block on some thread-safe queue(*), and push the interfaces to release into that queue as IUnknowns.
Note however that if the releasing touches a lock that your memory manager uses (like a global heapmanager lock), then this is futile, since your mainthread will block on the first heapmanager access.
With a heapmanager with per thread pools, allocating many items in one thread and releasing it in a different thread might frustrate coalescing and reuse of (small) blocks algorithms.
I still think the approach you describe is generally sound when implemented properly. The points above are theoretical: they show that the second thread can still be linked to the main thread through the heap manager.
(*) The simplest way is to add it to a TThreadList and use a TEvent to signal that an element was added.
That looks OK, but don't call the thread's Execute method directly; that will run the thread object's code in the current thread instead of the one that the thread object creates. Call Start or Resume instead.
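The hand-off idea discussed in these answers is not specific to Delphi. As a rough sketch in C with pthreads (a swapped-in illustration, not the asker's code), the helper below passes an object and its destroy function to a new thread so the teardown happens off the calling thread. `release_in_background`, `release_job` and `jobs_completed` are hypothetical names, and the sketch starts one thread per release instead of using the queue suggested above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* One pending release: the object and the function that tears it down. */
typedef struct {
    void *object;
    void (*destroy)(void *);
} release_job;

/* Instrumentation only: number of releases finished so far. */
atomic_int jobs_completed;

/* Thread entry point: runs on the new thread, not on the caller's. */
static void *release_worker(void *arg) {
    release_job *job = arg;
    job->destroy(job->object);      /* the expensive teardown happens here */
    free(job);
    atomic_fetch_add(&jobs_completed, 1);
    return NULL;
}

/* Hands `object` to a new thread that calls `destroy` on it.
   Returns 0 on success and stores the thread handle in *thread so the
   caller can join (or detach) it; on failure the object is untouched. */
int release_in_background(void *object, void (*destroy)(void *),
                          pthread_t *thread) {
    assert(destroy != NULL && thread != NULL);
    release_job *job = malloc(sizeof *job);
    if (job == NULL) return -1;
    job->object = object;
    job->destroy = destroy;
    int rc = pthread_create(thread, NULL, release_worker, job);
    if (rc != 0) free(job);
    return rc;
}
```

As noted above, whether this actually unblocks the main thread depends on the memory manager: if `destroy` takes a global heap lock, the main thread can still stall on its next allocation.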

Does pthread_detach free up the stack allocated to the child thread, after the child thread has exited

I have used pthread_detach in order to free up the stack allocated to the child thread, but this is not working, I guess it does not free up the memory.....
I don't want to use pthread_join. I know join assures me of freeing up the stack for child, but, I don't want the parent to hang up until the child thread terminates, I want my parent to do some other work in the mean time. So, I have used detach, as it will not block the parent thread.
Please, help me. I have been stuck..
this is not working
Yes it is. You are likely mis-interpreting your observations.
I want my parent to do some other work in the mean time
That's usually the reason to create threads in the first place, and you can do that:
pthread_create(...);
do_some_work(); // both current and new threads work in parallel
pthread_join(...); // wait for both threads to finish
report_results();
I don't want to use pthread_join. I know join assures me of freeing up the stack for child
Above statement is false: it assures no such thing. A common implementation will cache the now available child stack for reuse (in case you'll create another thread shortly).
YES - according to http://cursuri.cs.pub.ro/~apc/2003/resources/pthreads/uguide/users-16.htm it frees memory either when the thread ends or immediately if the thread has already ended...
As you don't provide any clue as how you determine that the memory is not freed up I can only assume that the method you use to determine it is not sufficient...
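For reference, a minimal sketch of starting a thread that is detached from the outset, so the parent never has to join it and the thread's resources can be reclaimed by the implementation once it exits. `start_detached` and `mark_done` are hypothetical names used only for this example; as the answers above note, the implementation may cache the stack for reuse instead of returning it to the system right away, so process memory figures alone do not show whether it was released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Example worker (hypothetical): sets the flag it is given and exits. */
void *mark_done(void *arg) {
    atomic_store((atomic_int *)arg, 1);
    return NULL;
}

/* Starts fn(arg) on a new thread created in the detached state.
   Returns 0 on success or a pthread error number on failure. */
int start_detached(void *(*fn)(void *), void *arg) {
    assert(fn != NULL);
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) {
        pthread_t thread;   /* handle is not needed: the thread is never joined */
        rc = pthread_create(&thread, &attr, fn, arg);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

The parent can continue with other work immediately after `start_detached` returns. Calling `pthread_detach` on an already created thread has the same effect; setting the attribute up front just avoids keeping the handle around.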

Resources