I am working through some retain-cycle issues with blocks/ARC, and I am trying to get my head around the nuances. Any guidance is appreciated.
Apple's documentation on "Blocks and Variables" (http://developer.apple.com/library/ios/#documentation/cocoa/Conceptual/Blocks/Articles/bxVariables.html) says the following:
If you use a block within the implementation of a method, the rules
for memory management of object instance variables are more subtle:
If you access an instance variable by reference, self is retained; If
you access an instance variable by value, the variable is retained.
The following examples illustrate the two different situations:
dispatch_async(queue, ^{
// instanceVariable is used by reference, self is retained
doSomethingWithObject(instanceVariable);
});
id localVariable = instanceVariable;
dispatch_async(queue, ^{
// localVariable is used by value, localVariable is retained (not self)
doSomethingWithObject(localVariable);
});
I find this explanation confusing.
Is this appropriate use of the "by value" / "by reference" terminology? Assuming that these variables are of the same type (id), it seems like the distinguishing characteristic between them is their scope.
I do not see how self is being referenced in the "by reference" example? If an accessor method were being used, (e.g. - below), I could see self being retained.
doSomethingWithObject(self.instanceVariable);
Do you have any guidance on when one might want to do things one way or the other?
If conventional wisdom is to utilize the "by value" variables, it seems like this is going to result in a lot of additional code for additional variable declarations?
In a circumstance where nested blocks are coming into play, it seems like it may be more maintainable to avoid declaring the blocks inside of each other as one could ultimately end up with a potpourri of unintentionally retained objects?
Think of the case of using instanceVariable as an equivalent of writing self->instanceVariable. An instance variable is by definition "attached" to the self object and exists while the self object exists.
Using instanceVariable (or self->instanceVariable) means that you start from the address of self, and ask for an instance variable (that is offset of some bytes from the self object original address).
Using localVariable is a variable on its own, that does not rely on self and is not an address that is relative to another object.
As the blocks capture the variables when they are created, you typically prefer using instance variables when you mean "when the block is executed, I want to get the value of the instance variable at the time of the execution" as you will ask the self object for the value of the instance variable at that time (exactly the same way as you would call [self someIVarAccessorMethod]). But then be careful not to create some retain cycles.
On the other hand, if you use localVariable, the local variable (and not self) will be captured when the block is created, so even if the local variable changes after the block creation, the old value will be used inside the block.
// Imagine instanceVariable being an ivar of type NSString
// And property being a #property of type NSString too
instanceVariable = #"ivar-before";
self.property = #"prop-before";
NSString* localVariable = #"locvar-before";
// When creating the block, self will be retained both because the block uses instanceVariable and self.property
// And localVariable will be retained too as it is used directly
dispatch_block_t block = ^{
NSLog(#"instance variable = %#", instanceVariable);
NSLog(#"property = %#", self.property);
NSLog(#"local variable = %#", localVariable);
};
// Modify some values after the block creation but before execution
instanceVariable = #"ivar-after";
self.property = #"prop-after";
localVariable = #"locvar-after";
// Execute the block
block();
In that example the output will show that instanceVariable and self.property are accessed thru the self object, so self was retained but the value of instanceVariable and self.property are queried in the code of the block and they will return their value at the time of execution, respectively "ivar-after" and "prop-after". On the other hand, localVariable was retained at the time the block was created and its value was const-copied at that time, so the last NSLog will show "locvar-before".
self is retained when you use instance variables or properties or call methods on self itself in the code of the block. Local variables are retained when you use them directly in the code of the block.
Note: I suggest you watch the WWDC'11 and WWDC'12 videos that talks about the subject, they are really instructive.
Is this appropriate use of the "by value" / "by reference" terminology? It's at least analogous to the typical use. Copying a value to a local variable is like copying a value onto the stack; using an ivar is like passing a pointer to a value that's stored somewhere else.
I do not see how self is being referenced in the "by reference" example? When you use an instance variable inside a method, the reference to self is implied. Some people actually write self->foo instead of foo to access an ivar just to remind themselves that foo is an ivar. I don't recommend that, but the point is that foo and self->foo mean the same thing. If you access an ivar inside a block, self will be retained in order to ensure that the ivar is preserved for the duration of the block.
Do you have any guidance on when one might want to do things one way or the other? It's useful to think in terms of the pass by reference/pass by value distinction here. As AliSoftware explained local variables are preserved when the block is created, just as parameters passed by value are copied when a function is called. An ivar is accessed through self just as a parameter passed by reference is accessed through a pointer, so its value isn't determined until you actually use it.
it seems like this is going to result in a lot of additional code for additional variable declarations? Blocks have been a feature of the language for a while now, and I haven't noticed that this is a problem. More often, you want the opposite behavior: a variable declared locally that you can modify within a block (or several blocks). The __block storage type makes that possible.
it seems like it may be more maintainable to avoid declaring the blocks inside of each other as one could ultimately end up with a potpourri of unintentionally retained objects? There's nothing wrong with letting one or more blocks retain an object for as long as they need it -- the object will be released just as soon as the blocks that use it terminate. This fits perfectly with the usual Objective-c manual memory management philosophy, where every object worries only about balancing its own retains. A better reason to avoid several layers of nested blocks is that that sort of code may be more difficult to understand than it needs to be.
Related
In Objective-C (as of Xcode 7), the __block modifier for use with primitives is clearly explained in Stack Overflow such as here and by Apple here in the Blocks Programming Guide. Without the label, a copy of the primitive is captured and the original cannot be modified from within the block. With the label, no copy, and the original can be modified from within the block.
But for use with pointers to objects, the situation is not so well explained. Here Apple says either “a strong reference is made to self” or “a strong reference is made to the variable”. But then at the end of the page Apple says:
To override this behavior for a particular object variable, you can mark it with the __block storage type modifier.
What does “override this behavior” mean?
Further complicating things is that some posts on Stack Overflow talking about making a weak reference when calling an object from within a block.
I am not trying to establish an object. I just want to modify an existing object’s state from within a block. I believe the block is being called synchronously, within the same thread.
Imagine:
Clicker* myClicker = [[Clicker alloc] init] ;
…
// Declare a block.
void (^myBlock)( );
// Populate the block.
myBlock = ^ void ( ) {
[myClicker click] ; // Updating some state in that object, such as incrementing a counter number or adding an element to a collection.
};
// Call `myBlock` synchronously (same thread) from some other code.
… // … invokes `myBlock` repeatedly …
My questions:
How should that code be modified with __block modifier?
How should that code be modified with weak references?
What other issues apply to an object’s state modified from within a block?
First of all, the basic point of __block is the same for all types of variables (primitive variables, object-pointer variables, and all other variables) -- __block makes the variable shared between the outside scope and the scopes of the blocks that capture it, so that an assignment (=) to the variable in one scope is seen in all other scopes.
So if you don't use __block, if you assign to an object-pointer variable to point to another object outside the block after the block is created, the block won't see the assignment to the variable, and will still see it pointing to the object it was pointing to when the block was created. Conversely, inside the block you won't be able to assign to the variable. If you do use __block, then assignments to the variable to point to another object, either inside or outside the block, will be reflected in the other scopes.
Note that mutation of objects' state has nothing to do with assignment to variables. You mutate objects' state by calling a mutating method on a pointer to the object, or you can alter a field directly using a pointer to the object using the -> syntax. Neither of these involve assigning to a variable holding the pointer to the object. On the other hand, assigning to a variable holding a pointer to an object only makes the pointer point to another object; it does not mutate any objects.
The part you are reading about strong references and "override this behavior" has to do with the separate issue of memory management behavior of blocks under MRC only. Under MRC, there is no concept of variables being __strong or __weak like in ARC. When a variable of object-pointer type is captured by a block in MRC, it is by default captured as a strong reference (the block retains it, and releases it when the block is deallocated), because that is the desired behavior most of the time. If you wanted the block to not retain the captured variable, the only way to do that was to make it __block, which not only made the variable shared between the two scopes, but also made the block not retain the variable. So the two concepts were conflated in MRC.
In ARC, whether a block captures a variable strongly or weakly depends on the captured variable being __strong or __weak (or __unsafe_unretained), and is completely orthogonal with whether the variable is __block or not. You can have object-pointer variables that are __block without being weakly captured, or weakly captured without being __block, or both weakly captured and __block if you want.
You quote the somewhat dated Blocks Programming Topics, which says:
To override this behavior for a particular object variable, you can mark it with the __block storage type modifier.
That document dates back to the days of manual reference counting, back before ARC. In manual reference counting code you could use __block to avoid establishing a strong reference to objects referenced inside the block.
But this behavior has changed with ARC, as outlined in Transitioning to ARC Release Notes. We now use weak in those cases where we don't want to establish strong references to the objects referenced in the block. (You can use unsafe_unretained in special cases where you know the resulting dangling pointer isn't a problem.)
So, go ahead and use __block when dealing with fundamental types that you want to mutate inside the block. But when dealing with objects in blocks under ARC, the __block qualifier generally doesn't enter the discussion. The question is simply whether you want a strong reference or a weak one (or an unsafe, unretained one). And that's largely dictated by the object graph of your app, namely (a) whether you need weak/unretained reference to prevent strong reference cycle; or (b) you don't want want some asynchronously executing block to unnecessarily prolong the life of some object referenced in the block. Neither of those situations would appear to be the case here.
I have been using blocks and familiar with the memory management when using self inside blocks in
Non-ARC environment
But I have two specific questions:
1) I understand I can use __block to avoid the retain cycle a retained block which in turn using self can create, like below:
__block MyClass *blockSelf = self;
self.myBlock = ^{
blockSelf.someProperty = abc;
[blockSelf someMethod];
};
This will avoid the retain cycle for sure but I by doing this I have created a scope for self to be released and eventually deallocated by someone else. So when this happen self is gone and blockSelf is pointing to a garbage value. There can be conditions when the block is executed after the self is deallocated, then the block will crash as it is trying to use deallocated instance. How do we can avoid this condition? How do I check if blockSelf is valid when block executes or stop block from executing when self is deallocated.
2) On the similar lines suppose I use block like below:
__block MyClass *blockSelf = self;
self.myBlock = ^{
[blockSelf someMethod:blockSelf.someProperty];
};
// I am taking someProperty in an method argument
-(void) someMethod:(datatype*)myClassProperty
{
myClassProperty = abc;
}
Now there can be situations where self is not released but someProperty is released before someMethod's execution starts (This could happen when there are multiple threads). Even if I do self.someProperty = nil; when it is release, myClassProperty is not nil and pointing to some garbage, hence when someMethod is executed the first line will lead to crash. How do I avoid this?
This is the same issue as non-zeroing weak references everywhere else in non-ARC code, e.g. delegates etc. (MRC doesn't have zeroing weak references; so these are the only kind of weak reference in MRC. Yet people were still able to write safe code in the pre-ARC days.)
Basically, the solution is that you need a clear map of ownership. Either self is responsible for keeping the block alive; or some other object is responsible for keeping the block alive.
For example, with delegates, usually, a "parent" object is the delegate of a "child" object; in this case the "parent" object is responsible for keeping the "child" object alive, so the "parent" object will outlive the "child" object, thus the back reference can be weak and is safe (because the child object's methods could only possibly be called by the parent object while the parent object is alive).
On the other hand, if you have an asynchronous operation, and the block is given to the operation as a callback, then usually the operation is responsible for holding onto the block. In this case, self would not hold onto the block. And the block would hold a strong reference to self, so that when the operation is done, it can still safely do whatever it is that it needs to do on self. In fact, whatever object self is, it doesn't even need to be retained by whoever uses it, since it is indirectly retained by the asynchronous operation -- it can just be a create, fire, and forget kind of thing.
If you have something where the block is sometimes kept alive by self, and sometimes kept alive by something else, then you should re-think your design. You say "There can be conditions when the block is executed after the self is deallocated"; well, you should describe your whole design and how this can happen. Because usually, for something that is executed asynchronously, self need not hold onto the block.
Your code is really confusing and doesn't make sense. For example, why would you assign to a parameter (myClassProperty)? What's the point of passing an argument when the parameter is going to be overwritten anyway? Why would you name a local variable "myClassProperty"?
What I think you are asking about is accessing a property that can be changed on different threads, and how to deal with the memory management of that. (This question is unrelated to blocks or ARC/MRC. It is equally an issue in ARC.)
The answer is that you need an atomic property. You can make a synchronized property atomic, or implement an atomic property manually if you know how. What an atomic property of object pointer type needs to do is its getter needs to not just return the underlying variable, but also retain and autorelease it, and return the result of that. The retrieval and retaining in the getter occurs in a critical section that is synchronized with a critical section in the setter that includes the release of the old value and retaining of the new. Basically, this synchronization guarantees that the value will not be released in the setter in between retrieving the value and retaining it in the getter. And the value returned from the getter is retained and autoreleased, so it is guaranteed to be alive for the duration of the scope.
Solution for 2)
__block MyClass *blockSelf = self;
self.myBlock = ^{
datatype* p = [blockSelf.someProperty retain];
[blockSelf someMethod:p];
[p release];
};
// I am taking someProperty in an method argument
-(void) someMethod:(datatype*)myClassProperty
{
myClassProperty = abc;
}
For 1) don't know how you use myBlock. If it's used only by self, then all will be fine. And if it's used by some other object, then it's also should have retained reference at self and then there is also all will be fine.
I am having issues in code when calling dispatch_async. I think that issue is due to ARC reclaiming the object, before it is used in a block, as the method that dispatches it finishes.
- (void) method:(SomeClass *) someClass {
// local variable
NSNumber *someValue = someClass.somePropertyOnManagedObject;
dispatch_async(queue, ^() {
/* call some singleton object passing variable
* when access the variable, reference is nil
*/
[[DashboardFacade sharedInstance] someMethod:someValue];
});
}
After having looking through much documentation, I conclude
Block accesses no parameters – nothing to discuss
Block accesses simple type parameters e.g. BOOL, int - these are copied and not a problem
Block accesses parameter of method that dispatched it - I am not sure, but think
that this is ok
Block accesses property of self – as long as self “lives” until the call has finished ok
Block accesses local variable in method that dispatched it
If we use some semaphores such that we wait for the block to return before leaving the method, then all ok
Otherwise variable may have been garbage collected before block can use.
I think that the solution is to use __block modifier such that ARC retains the variable.
My question is
Is the above technically correct, e.g. using __block will resolve the problem and not introduce other problems?
Why can't I find this anywhere on the internet/google?
Is the above technically correct, e.g. using __block will resolve the
problem and not introduce other problems?
Yes it's technically correct, __block, on ARC, allows you to change the variable (in this case to where its pointing, since it's a NSNumber *), that was declared outside the block context.
Why can't I find this anywhere on the internet/google?
I think the problem is not related to this particular place of your code, but something else.
someValue inside the block can be nil only if it is nil after this assignment:
NSNumber *someValue = someClass.somePropertyOnManagedObject;
Blocks retain objects that are being captured by the blocks as a result of those variables being used inside the blocks. And it doesn’t matter if it is self or a local variable from the scope that is visible to the block.
I'm currently having trouble understanding the fundamentals of Obj-C blocks and the __block storage type. From the following documentation:
http://developer.apple.com/library/ios/#documentation/cocoa/Conceptual/Blocks/Articles/bxVariables.html#//apple_ref/doc/uid/TP40007502-CH6-SW6
I'm trying to understand the following paragraph and examples:
When a block is copied, it creates strong references to object variables used within the block. If you use a block within the implementation of a method:
If you access an instance variable by reference, a strong reference is made to self;
If you access an instance variable by value, a strong reference is made to the variable.
The following examples illustrate the two different situations:
dispatch_async(queue, ^{
// instanceVariable is used by reference, a strong reference is made to self
doSomethingWithObject(instanceVariable);
});
id localVariable = instanceVariable;
dispatch_async(queue, ^{
/*
localVariable is used by value, a strong reference is made to localVariable
(and not to self).
*/
doSomethingWithObject(localVariable);
});
To override this behavior for a particular object variable, you can mark it with the __block storage type modifier.
My questions:
How exactly is one example "accessed by reference" while the other one is accessed by variable? Why is localVariable "used by value?"
What does the doc mean by "a strong reference is made to self"? Which "self" is it referring to?
If I add the __block storage type to localVariable in the second example, am I wrong to assume that the block closes over the variable, so it keeps it around in heap until block is released? What other things are happening?
Thank you!
How exactly is one example "accessed by reference" while the other one is accessed by variable? Why is localVariable "used by value?"
One way to understand this is the following:
when you use a local variable in a block defined in the method, what happens is that the content of the variable is copied in some block private memory, so that it is available when the block is executed (after the method exits); in this sense we may talk of accessed "by value" (as in: the value is copied); syntactically the compiler doesn't know what the content of localVariable refers to, so its values is treated as such;
when directly accessing instanceVariable in a block defined within a method of a class, the compiler knows that we are accessing the same object that is executing the method and there is not need to copy anything, since the object has a longer lifetime than the method where the block is found; but we need to ensure that the object is still there when the block is executed, so we get a strong reference to it.
Now, as to the use of "by reference": in the first case, you get a copy of the reference to the class member: if you could change its value (but you can't, because the compiler forbids it), you were just modifying a block private copy, so the original object is not affected.
In the second case, you can modify the value of instanceVariable (like in: it was nil and you allocate an object referred through it) and this will affect the object that executed the method were the block is defined.
What does the doc mean by "a strong reference is made to self"? Which "self" is it referring to?
self is the object which is currently executing the method where the block is found. that a strong reference is made only means (in ARC parlance) that the retain count of the object is incremented (to make sure some other entity cannot cause it to be deallocated by releasing it).
If I add the __block storage type to localVariable in the second example, am I wrong to assume that the block closes over the variable, so it keeps it around in heap until block is released? What other things are happening?
Using __block makes the variable be always accessed "by reference", so you can modify them.
This is how they are handled:
__block variables live in storage that is shared between the lexical scope of the variable and all blocks and block copies declared or created within the variable’s lexical scope. Thus, the storage will survive the destruction of the stack frame if any copies of the blocks declared within the frame survive beyond the end of the frame (for example, by being enqueued somewhere for later execution). Multiple blocks in a given lexical scope can simultaneously use a shared variable.
As an optimization, block storage starts out on the stack—just like blocks themselves do. If the block is copied using Block_copy (or in Objective-C when the block is sent a copy), variables are copied to the heap. Thus, the address of a __block variable can change over time.
I understand that the call to CLGeocoder geocodeAddressString is asynchronous, with a block passed in to handle the callback at completion time. I also understand that the only variables that are mutable within the block are local __block variables. But I need to store the resulting CLPlacemarks in a global NSMutableArray variable, and I can't work out how that can happen. Any __block variables can only be accessed within the same method, but at the point that they have any value (i.e. within the block) I can't assign any value to the global array. After the block, the __block variables will likely not have any value because of the immediate return when calling the async geocodeAddressString.
How can I store the results of the call, so that they can be used to update a UITableView?
Ok, solved. I was incorrectly deducing (from my reading about blocks) that properties and instance variables could not be modified within the scope of a block. In my case, the reason I couldn't modify my iVar array was that I hadn't alloc/init'ed it. Once I did that, I was able to assign values and see them outside of the block e.g. in my cellForRowAtIndexPath method of my table view.
Apologies to anyone who may have been misled by my initial assumptions! For the record, I do think that the text on the Apple docs re blocks & variables here is a bit misleading...the only variables that it mentions as being mutable are the __block function-level variables, whereas it says all others "can be accessed" which I took as meaning "read-only" (I know that both getters and setters are accessors, but the context was confusing).