Objective-C cpu cache behavior - ios

Apple provides some documentation about synchronizing variables and even order of execution. What I don't see is any documentation on CPU cache behavior. What guarantees and control does the Objective-C developer have to ensure cache coherence between threads?
Consider the following where a variable is set on a background thread but read on the main thread:
self.count = 0;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^ {
self.count = 5;
dispatch_async(dispatch_get_main_queue(), ^{
NSLog(#"%i", self.count);
});
}
Should count be volatile in this case?
Update 1
The documentation in Inter-thread Communication guarantees that shared variables can be used for inter-thread communication.
Another simple way to communicate information between two threads is to use a global variable, shared object, or shared block of memory.
Does this imply volatile is not required in this case? This is conflicting with the documentation in Memory Barriers and Volatile Variables:
If the variable is visible from another thread however, such an optimization might prevent the other thread from noticing any changes to it. Applying the volatile keyword to a variable forces the compiler to load that variable from memory each time it is used.
So I still don't know whether volatile is required because the compiler could use register caching optimizations or if it's not required because the compiler somehow knows it's a "shared" something.
The documentation is not very clear about what a shared variable is or how the compiler knows about it. In the above example, is count a shared object? Let's say count is an int, then it's not an object. Is it a shared block of memory or does that only apply to __block declared variables? Maybe volatile is required for non-block, non-object, non-global, shared variables.
Update 2
To everyone thinking this is a question about synchronization, it's not. This is about CPU cache behavior on the iOS platform.

I know you are probably asking about the general case of using variables across threads (in which case the rules about using volatile and locks are the same for ObjC as it is for normal C). However, for the example code you posted the rules are a little different. (I'll be skipping over and simplifying things and using Xcode to mean both Xcode and the compiler)
self.count = 0;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^ {
self.count = 5;
dispatch_async(dispatch_get_main_queue(), ^{
NSLog(#"%i", self.count);
});
}
I'm going to assume self is an NSObject subclass something like this:
#interface MyClass : NSObject {
NSInteger something;
}
#property (nonatomic, assign) NSInteger count;
#end
Objective C is a superset of C, and if you've ever done any reverse engineering of ObjC you'll know that ObjC code (sort of, not quite) gets converted into C code before it's compiled. All [self method:object] calls get converted to objc_msgSend(self, "method:", object) calls and self is a C struct with ivars and other runtime info in it.
This means this code doesn't do quite what you might expect.
-(void)doThing{
NSInteger results = something + self.count;
}
Just accessing something isn't just accessing the variable but is instead doing self->something (which is why you need to get a weak reference to self when accessing an ivar in an Objective C block to avoid a retain cycle).
The second point is Objective C properties don't really exist. self.count gets turned into [self count] and self.count = 5 gets turned into [self setCount:5]. Objective C properties are just syntax sugar; convenience save you some typing and make things look a bit nicer.
If you've been using Objective C for more than a few years ago you'll remember when you had to add #synthesize propertyName = _ivarName to the #implementation for ObjC properties you declared in the header. (now Xcode does it automatically for you)
#synthesize was a trigger for Xcode to generate the setter and getter methods for you. (if you hadn't written #synthesize Xcode expected you to write the setter and getter yourself)
// Auto generated code you never see unless you reverse engineer the compiled binary
-(void)setCount:(NSInteger)count{
_count = count;
}
-(NSInteger)count{
return _count;
}
If you are worried about threading issues with self.count you are worried about 2 threads calling these methods at once (not directly accessing the same variable at once, as self.count is actually a method call not a variable).
The property definition in the header changes what code is generated (unless you implement the setter yourself).
#property (nonatomic, retain)
[_count release];
[count retain];
_count = count;
#property (nonatomic, copy)
[_count release];
_count = [count copy];
#property (nonatomic, assign)
_count = count;
TLDR
If you care about threading and want to make sure you don't read the value half way through a write happening on another thread then change nonatomic to atomic (or get rid of nonatomic as atomic is the default). Which will result in code generated something like this.
#property (atomic, assign) NSInteger count;
// setter
#synchronized(self) {
_count = count;
}
This won't guarantee your code is thread safe, but (as long as you only access the property view it's setter and getter) should mean you avoid the possibility of reading the value during a write on another thread. More info about atomic and nonatmoic in the answers to this question.

The simplest way, and the way that is least challenging to the developer's brain, is to perform tasks on a serial dispatch queue. A serial dispatch queue, like the main queue, is a tiny single threaded island in a multi-threaded world.

You should use lock or some other synchronise mechanism to protect the shared variable. According to the documentation it said:
Another simple way to communicate information between two threads is to use a global variable, shared object, or shared block of memory. Although shared variables are fast and simple, they are also more fragile than direct messaging. Shared variables must be carefully protected with locks or other synchronization mechanisms to ensure the correctness of your code. Failure to do so could lead to race conditions, corrupted data, or crashes.
In fact the best way to protect the counter variable is to use the Atomic Operation. You can read the article: https://www.mikeash.com/pyblog/friday-qa-2011-03-04-a-tour-of-osatomic.html

Related

When to use the autorelease pool [duplicate]

For the most part with ARC (Automatic Reference Counting), we don't need to think about memory management at all with Objective-C objects. It is not permitted to create NSAutoreleasePools anymore, however there is a new syntax:
#autoreleasepool {
…
}
My question is, why would I ever need this when I'm not supposed to be manually releasing/autoreleasing ?
EDIT: To sum up what I got out of all the anwers and comments succinctly:
New Syntax:
#autoreleasepool { … } is new syntax for
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
…
[pool drain];
More importantly:
ARC uses autorelease as well as release.
It needs an auto release pool in place to do so.
ARC doesn't create the auto release pool for you. However:
The main thread of every Cocoa app already has an autorelease pool in it.
There are two occasions when you might want to make use of #autoreleasepool:
When you are in a secondary thread and there is no auto release pool, you must make your own to prevent leaks, such as myRunLoop(…) { #autoreleasepool { … } return success; }.
When you wish to create a more local pool, as #mattjgalloway has shown in his answer.
ARC doesn't get rid of retains, releases and autoreleases, it just adds in the required ones for you. So there are still calls to retain, there are still calls to release, there are still calls to autorelease and there are still auto release pools.
One of the other changes they made with the new Clang 3.0 compiler and ARC is that they replaced NSAutoReleasePool with the #autoreleasepool compiler directive. NSAutoReleasePool was always a bit of a special "object" anyway and they made it so that the syntax of using one is not confused with an object so that it's generally a bit more simple.
So basically, you need #autoreleasepool because there are still auto release pools to worry about. You just don't need to worry about adding in autorelease calls.
An example of using an auto release pool:
- (void)useALoadOfNumbers {
for (int j = 0; j < 10000; ++j) {
#autoreleasepool {
for (int i = 0; i < 10000; ++i) {
NSNumber *number = [NSNumber numberWithInt:(i+j)];
NSLog(#"number = %p", number);
}
}
}
}
A hugely contrived example, sure, but if you didn't have the #autoreleasepool inside the outer for-loop then you'd be releasing 100000000 objects later on rather than 10000 each time round the outer for-loop.
Update:
Also see this answer - https://stackoverflow.com/a/7950636/1068248 - for why #autoreleasepool is nothing to do with ARC.
Update:
I took a look into the internals of what's going on here and wrote it up on my blog. If you take a look there then you will see exactly what ARC is doing and how the new style #autoreleasepool and how it introduces a scope is used by the compiler to infer information about what retains, releases & autoreleases are required.
#autoreleasepool doesn't autorelease anything. It creates an autorelease pool, so that when the end of block is reached, any objects that were autoreleased by ARC while the block was active will be sent release messages. Apple's Advanced Memory Management Programming Guide explains it thus:
At the end of the autorelease pool block, objects that received an autorelease message within the block are sent a release message—an object receives a release message for each time it was sent an autorelease message within the block.
People often misunderstand ARC for some kind of garbage collection or the like. The truth is that, after some time people at Apple (thanks to llvm and clang projects) realized that Objective-C's memory administration (all the retains and releases, etc.) can be fully automatized at compile time. This is, just by reading the code, even before it is run! :)
In order to do so there is only one condition: We MUST follow the rules, otherwise the compiler would not be able to automate the process at compile time. So, to ensure that we never break the rules, we are not allowed to explicitly write release, retain, etc. Those calls are Automatically injected into our code by the compiler. Hence internally we still have autoreleases, retain, release, etc. It is just we don't need to write them anymore.
The A of ARC is automatic at compile time, which is much better than at run time like garbage collection.
We still have #autoreleasepool{...} because having it does not break any of the rules, we are free create/drain our pool anytime we need it :).
Autorelease pools are required for returning newly created objects from a method. E.g. consider this piece of code:
- (NSString *)messageOfTheDay {
return [[NSString alloc] initWithFormat:#"Hello %#!", self.username];
}
The string created in the method will have a retain count of one. Now who shall balance that retain count with a release?
The method itself? Not possible, it has to return the created object, so it must not release it prior to returning.
The caller of the method? The caller does not expect to retrieve an object that needs releasing, the method name does not imply that a new object is created, it only says that an object is returned and this returned object may be a new one requiring a release but it may as well be an existing one that doesn't. What the method does return may even depend on some internal state, so the the caller cannot know if it has to release that object and it shouldn't have to care.
If the caller had to always release all returned object by convention, then every object not newly created would always have to be retained prior to returning it from a method and it would have to be released by the caller once it goes out of scope, unless it is returned again. This would be highly inefficient in many cases as one can completely avoid altering retain counts in many cases if the caller will not always release the returned object.
That's why there are autorelease pools, so the first method will in fact become
- (NSString *)messageOfTheDay {
NSString * res = [[NSString alloc] initWithFormat:#"Hello %#!", self.username];
return [res autorelease];
}
Calling autorelease on an object adds it to the autorelease pool, but what does that really mean, adding an object to the autorelease pool? Well, it means telling your system "I want you to to release that object for me but at some later time, not now; it has a retain count that needs to be balanced by a release otherwise memory will leak but I cannot do that myself right now, as I need the object to stay alive beyond my current scope and my caller won't do it for me either, it has no knowledge that this needs to be done. So add it to your pool and once you clean up that pool, also clean up my object for me."
With ARC the compiler decides for you when to retain an object, when to release an object and when to add it to an autorelease pool but it still requires the presence of autorelease pools to be able to return newly created objects from methods without leaking memory. Apple has just made some nifty optimizations to the generated code which will sometimes eliminate autorelease pools during runtime. These optimizations require that both, the caller and the callee are using ARC (remember mixing ARC and non-ARC is legal and also officially supported) and if that is actually the case can only be known at runtime.
Consider this ARC Code:
// Callee
- (SomeObject *)getSomeObject {
return [[SomeObject alloc] init];
}
// Caller
SomeObject * obj = [self getSomeObject];
[obj doStuff];
The code that the system generates, can either behave like the following code (that is the safe version that allows you to freely mix ARC and non-ARC code):
// Callee
- (SomeObject *)getSomeObject {
return [[[SomeObject alloc] init] autorelease];
}
// Caller
SomeObject * obj = [[self getSomeObject] retain];
[obj doStuff];
[obj release];
(Note the retain/release in the caller is just a defensive safety retain, it's not strictly required, the code would be perfectly correct without it)
Or it can behave like this code, in case that both are detected to use ARC at runtime:
// Callee
- (SomeObject *)getSomeObject {
return [[SomeObject alloc] init];
}
// Caller
SomeObject * obj = [self getSomeObject];
[obj doStuff];
[obj release];
As you can see, Apple eliminates the atuorelease, thus also the delayed object release when the pool is destroyed, as well as the safety retain. To learn more about how that is possible and what's really going on behind the scenes, check out this blog post.
Now to the actual question: Why would one use #autoreleasepool?
For most developers, there's only one reason left today for using this construct in their code and that is to keep the memory footprint small where applicable. E.g. consider this loop:
for (int i = 0; i < 1000000; i++) {
// ... code ...
TempObject * to = [TempObject tempObjectForData:...];
// ... do something with to ...
}
Assume that every call to tempObjectForData may create a new TempObject that is returned autorelease. The for-loop will create one million of these temp objects which are all collected in the current autoreleasepool and only once that pool is destroyed, all the temp objects are destroyed as well. Until that happens, you have one million of these temp objects in memory.
If you write the code like this instead:
for (int i = 0; i < 1000000; i++) #autoreleasepool {
// ... code ...
TempObject * to = [TempObject tempObjectForData:...];
// ... do something with to ...
}
Then a new pool is created every time the for-loop runs and is destroyed at the end of each loop iteration. That way at most one temp object is hanging around in memory at any time despite the loop running one million times.
In the past you often had to also manage autoreleasepools yourself when managing threads (e.g. using NSThread) as only the main thread automatically has an autorelease pool for a Cocoa/UIKit app. Yet this is pretty much legacy today as today you probably wouldn't use threads to begin with. You'd use GCD DispatchQueue's or NSOperationQueue's and these two both do manage a top level autorelease pool for you, created before running a block/task and destroyed once done with it.
It's because you still need to provide the compiler with hints about when it is safe for autoreleased objects to go out of scope.
Quoted from https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmAutoreleasePools.html:
Autorelease Pool Blocks and Threads
Each thread in a Cocoa application maintains its own stack of
autorelease pool blocks. If you are writing a Foundation-only program
or if you detach a thread, you need to create your own autorelease
pool block.
If your application or thread is long-lived and potentially generates
a lot of autoreleased objects, you should use autorelease pool blocks
(like AppKit and UIKit do on the main thread); otherwise, autoreleased
objects accumulate and your memory footprint grows. If your detached
thread does not make Cocoa calls, you do not need to use an
autorelease pool block.
Note: If you create secondary threads using the POSIX thread APIs
instead of NSThread, you cannot use Cocoa unless Cocoa is in
multithreading mode. Cocoa enters multithreading mode only after
detaching its first NSThread object. To use Cocoa on secondary POSIX
threads, your application must first detach at least one NSThread
object, which can immediately exit. You can test whether Cocoa is in
multithreading mode with the NSThread class method isMultiThreaded.
...
In Automatic Reference Counting, or ARC, the system uses the same
reference counting system as MRR, but it insertsthe appropriate memory
management method callsfor you at compile-time. You are strongly
encouraged to use ARC for new projects. If you use ARC, there is
typically no need to understand the underlying implementation
described in this document, although it may in some situations be
helpful. For more about ARC, see Transitioning to ARC Release Notes.
TL;DR
Why is #autoreleasepool still needed with ARC?
#autoreleasepool is used by Objective-C and Swift to work with autorelese inside
When you work with pure Swift and allocate Swift objects - ARC handles it
But if you decide call/use Foundation/Legacy Objective-C code(NSData, Data) which uses autorelese inside then #autoreleasepool in a rescue
//Swift
let imageData = try! Data(contentsOf: url)
//Data init uses Objective-C code with [NSData dataWithContentsOfURL] which uses `autorelese`
Long answer
MRC, ARC, GC
Manual Reference Counting(MRC) or Manual Retain-Release(MRR) as a developer you are responsible for counting references on objects manually
Automatic Reference Counting(ARC) was introduced in iOS v5.0 and OS X Mountain Lion with xCode v4.2
Garbage Collection(GC) was available for Mac OS and was deprecated in OS X Mountain Lion. Must Move to ARC
Reference count in MRC and ARC
//MRC
NSLog(#"Retain Count: %d", [variable retainCount]);
//ARC
NSLog(#"Retain Count: %ld", CFGetRetainCount((__bridge CFTypeRef) variable));
Every object in heap has an integer value which indicates how many references are pointed out on it. When it equals to 0 object is deallocated by system
Allocating object
Working with Reference count
Deallocating object. deinit is called when retainCount == 0
MRC
A *a1 = [[A alloc] init]; //this A object retainCount = 1
A *a2 = a1;
[a2 retain]; //this A object retainCount = 2
// a1, a2 -> object in heap with retainCount
Correct way to release an object:
release If only this - dangling pointer. Because it still can point on the object in heap and it is possible to send a message
= nil If only this - memory leak. deinit will not be called
A *a = [[A alloc] init]; //++retainCount = 1
[a release]; //--retainCount = 0
a = nil; //guarantees that even somebody else has a reference to the object, and we try to send some message thought variable `a` this message will be just skipped
Working with Reference count(Object owner rules):
(0 -> 1) alloc, new, copy, mutableCopy
(+1) retain You are able to own an object as many times as you need(you can call retain several times)
(-1) release If you an owner you must release it. If you release more than retainCount it will be 0
(-1) autorelease Adds an object, which should be released, to autorelease pool. This pool will be processed at the end of RunLoop iteration cycle(it means when all tasks will be finished on the stack)[About] and after that release will be applied for all objects in the pool
(-1) #autoreleasepool Forces process an autorelease pool at the end of block. It is used when you deal with autorelease in a loop and want to clear resources ASAP. If you don't do it your memory footprint will be constantly increasing
autorelease is used in method calls when you allocate a new object there and return it
- (B *)foo {
B *b1 = [[B alloc] init]; //retainCount = 1
//fix - correct way - add it to fix wrong way
//[b1 autorelease];
//wrong way(without fix)
return b;
}
- (void)testFoo {
B *b2 = [a foo];
[b2 retain]; //retainCount = 2
//some logic
[b2 release]; //retainCount = 1
//Memory Leak
}
#autoreleasepool example
- (void)testFoo {
for(i=0; i<100; i++) {
B *b2 = [a foo];
//process b2
}
}
ARC
One of biggest advantage of ARC is that it automatically insert retain, release, autorelease under the hood in Compile Time and as developer you should not take care of it anymore
Enable/Disable ARC
//enable
-fobjc-arc
//disable
-fno-objc-arc
Variants from more to less priority
//1. local file - most priority
Build Phases -> Compile Sources -> Compiler Flags(Select files -> Enter)
//2. global
Build Settings -> Other C Flags(OTHER_CFLAGS)
//3. global
Build Settings -> Objective-C Automatic Reference Counting(CLANG_ENABLE_OBJC_ARC)
Check if ARC is enabled/disabled
Preprocessor __has_feature function is used
__has_feature(objc_arc)
Compile time
// error if ARC is Off. Force to enable ARC
#if ! __has_feature(objc_arc)
#error Please enable ARC for this file
#endif
//or
// error if ARC is On. Force to disable ARC
#if __has_feature(objc_arc)
#error Please disable ARC for this file
#endif
Runtime
#if __has_feature(objc_arc)
// ARC is On
NSLog(#"ARC on");
#else
// ARC is Off
NSLog(#"ARC off");
#endif
Reverse engineering(for Objective-C)
//ARC is enabled
otool -I -v <binary_path> | grep "<mrc_message>"
//e.g.
otool -I -v "/Users/alex/ARC_experiments.app/ARC_experiments" | grep "_objc_release"
//result
0x00000001000080e0 748 _objc_release
//<mrc_message>
_objc_retain
_objc_release
_objc_autoreleaseReturnValue
_objc_retainAutoreleaseReturnValue
_objc_retainAutoreleasedReturnValue
_objc_storeStrong
Tool to Migrate Objective-C MRC to ARC
ARC generates errors where you should manually remove retain, release, autorelease and others issues
Edit -> Convert -> To Objective-C ARC...
New Xcode with MRC
If you enable MRC you get next errors(warnings)(but the build will be successful)
//release/retain/autorelease/retainCount
'release' is unavailable: not available in automatic reference counting mode
ARC forbids explicit message send of 'release'
There seems to be a lot of confusion on this topic (and at least 80 people who probably are now confused about this and think they need to sprinkle #autoreleasepool around their code).
If a project (including its dependencies) exclusively uses ARC, then #autoreleasepool never needs to be used and will do nothing useful. ARC will handle releasing objects at the correct time. For example:
#interface Testing: NSObject
+ (void) test;
#end
#implementation Testing
- (void) dealloc { NSLog(#"dealloc"); }
+ (void) test
{
while(true) NSLog(#"p = %p", [Testing new]);
}
#end
displays:
p = 0x17696f80
dealloc
p = 0x17570a90
dealloc
Each Testing object is deallocated as soon as the value goes out of scope, without waiting for an autorelease pool to be exited. (The same thing happens with the NSNumber example; this just lets us observe the dealloc.) ARC does not use autorelease.
The reason #autoreleasepool is still allowed is for mixed ARC and non-ARC projects, which haven't yet completely transitioned to ARC.
If you call into non-ARC code, it may return an autoreleased object. In that case, the above loop would leak, since the current autorelease pool will never be exited. That's where you'd want to put an #autoreleasepool around the code block.
But if you've completely made the ARC transition, then forget about autoreleasepool.

In which iOS version, atomic/nonatomic were added?

In which iOS version, atomic/nonatomic were added? iOS 2.0 , 4.0 etc
It's not an iOS SDK version feature. It's the Objective-C compiler's (LLVM by default) language feature.
#property (nonatomic) NSString* prop;
gets translated to machine code that's executed by iOS. One day somebody taught the compiler (in XCode) : if you see nonatomic keyword produc machine code equal to this operation:
- (void) setProp:(NSString *)prop_ {
[prop retain];
[prop release];
prop = userName_;
}
if you see atomic keyword make sure the access is synchronized:
- (void) setProp:(NSString*)prop_ {
#synchronized(self) {
[prop release];
prop = [prop_ retain];
} }
After it gets compiled every iOS version will understand it.
The last two are identical; "atomic" is the default behavior (note that it is not actually a keyword; it is specified only by the absence of nonatomic).
Assuming that you are #synthesizing the method implementations, atomic vs. non-atomic changes the generated code. If you are writing your own setter/getters, atomic/nonatomic/retain/assign/copy are merely advisory.
With "atomic", the synthesized setter/getter will ensure that a whole value is always returned from the getter or set by the setter, regardless of setter activity on any other thread. That is, if thread A is in the middle of the getter while thread B calls the setter, an actual viable value -- an autoreleased object, most likely -- will be returned to the caller in A.
In nonatomic, no such guarantees are made. Thus, nonatomic is considerably faster than "atomic".

In iOS #synchronized for 2 methods at once?

Typically #synchronized(self) creates something like critical section.
My problem is I have more than one function which should be accessed with one thread only.
But what will the application do if I write #synchronized(self) in each such method? Does it mean one thread can use method1 and other thread can use method2? If no then how to implement it correctly?
#synchronized attempts to obtain a lock on the object that is passed to it. If the lock is obtained then execution continues. If the lock can't be contained then the thread blocks until the lock can be obtained.
The object that you pass to #synchronized should be the object that you want to protect from simultaneous updates. This may be self or it may be a property of self. For example, consider the following simple queue implementation:
#property (nonatomic,strong) NSMutableArray *qArray;
-(void)append:(id)newObject {
#synchronized(self.qArray) {
[self.qArray addObject:newObject];
}
}
-(id) head {
id ret=nil;
#synchronized(self.qArray) {
if (self.qArray.count >0) {
ret=self.qArray[0];
[self.qArray removeObjectAtIndex:0];
}
}
return ret;
}
In this case self.qArray is a good choice for the #synchronized as it is the object being modified
Frome someone
The object passed to the #synchronized directive is a unique identifier used to distinguish the protected block. If you execute the preceding method in two different threads, passing a different object for the anObj parameter on each thread, each would take its lock and continue processing without being blocked by the other. If you pass the same object in both cases, however, one of the threads would acquire the lock first and the other would block until the first thread completed the critical section.
- (void)myMethod:(id)anObj
{
#synchronized(anObj)
{
// Everything between the braces is protected by the #synchronized directive.
}
}
If you access two (or more) functions via one thread. the #synchronized would don't affect your code. because your function is run synchronised without lock help.

multithread autorelease issue with ARC?

A class of service has a nonatomic property which is set in a serial queue.
#interface Service
#property (strong, nonatomic) NSDictionary *status;
#property (nonatomic) dispatch_queue_t queue;
...
#end
- (void)update:(NSDicationary *)paramDict {
dispatch_async(self.queue, ^{
....
self.status = updateDict;
}
}
- (void)someMethod {
NSDictionary *status = self.status;
}
The app crashed when the getter was invoked, at objc_autorelease + 6, which seems to be a runtime/Clang/llvm invocation.
And the crash log also showed that the status property was just set on the queue thread.
Did it crash because of the lack of atomicity in the accessors? If yes, how and why the getter failed retaining the instance? Did the autorelease pool get drained inside the synthesized nonatomic setter?
Should I implement the getter/setter method, protecting it with queue/a mutex lock?
While atomic can address some integrity issues of fundamental data types in multi-threaded code, it's not sufficient to achieve thread-safety in general. Thread-safety is generally achieved through the judicious use of locks or queues. See the Synchronization section of the Threading Programming Guide. Or see Eliminating Lock-Based Code of the Concurrency Programming Guide that describes the use of queues in lieu of synchronization locks.
Assuming your queue is serial, you might be able to make it thread-safe by using the following construct:
- (void)someMethod {
dispatch_sync(self.queue, ^{
NSDictionary *status = self.status;
// do what you need to with status
});
}
That way, you're effectively using your serial queue to synchronize access to the status dictionary.
By the way, if your queue is a custom concurrent, you might also want to make sure that you replace the dispatch_async in paramDict to be dispatch_barrier_async. If your queue is serial, then dispatch_async is fine.
I'd suggest you try synchronizing your access to status, either using your queue or one of the synchronization techniques described in the Threading Programming Guide.
If you don't mind, change the code like this.
nonatomic -> atomic

Does #synchronized guarantees for thread safety or not?

With reference to this answer, I am wondering is this correct?
#synchronized does not make any code "thread-safe"
As I tried to find any documentation or link to support this statement, for no success.
Any comments and/or answers will be appreciated on this.
For better thread safety we can go for other tools, this is known to me.
#synchronized does make code thread safe if it is used properly.
For example:
Lets say I have a class that accesses a non thread safe database. I don't want to read and write to the database at the same time as this will likely result in a crash.
So lets say I have two methods. storeData: and readData on a singleton class called LocalStore.
- (void)storeData:(NSData *)data
{
[self writeDataToDisk:data];
}
- (NSData *)readData
{
return [self readDataFromDisk];
}
Now If I were to dispatch each of these methods onto their own thread like so:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
[[LocalStore sharedStore] storeData:data];
});
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
[[LocalStore sharedStore] readData];
});
Chances are we would get a crash. However if we change our storeData and readData methods to use #synchronized
- (void)storeData:(NSData *)data
{
#synchronized(self) {
[self writeDataToDisk:data];
}
}
- (NSData *)readData
{
#synchronized(self) {
return [self readDataFromDisk];
}
}
Now this code would be thread safe. It is important to note that if I remove one of the #synchronized statements however the code would no longer be thread safe. Or if I were to synchronize different objects instead of self.
#synchronized creates a mutex lock on the object you are syncrhonizing. So in other words if any code wants to access code in a #synchronized(self) { } block it will have to get in line behind all previous code running within in that same block.
If we were to create different localStore objects, the #synchronized(self) would only lock down each object individually. Does that make sense?
Think of it like this. You have a whole bunch of people waiting in separate lines, each line is numbered 1-10. You can choose what line you want each person to wait in (by synchronizing on a per line basis), or if you don't use #synchronized you can jump straight to the front and skip all the lines. A person in line 1 doesn't have to wait for a person in line 2 to finish, but the person in line 1 does have to wait for everyone in front of them in their line to finish.
I think the essence of the question is:
is the proper use of synchronize able to solve any thread-safe
problem?
Technically yes, but in practice it's advisable to learn and use other tools.
I'll answer without assuming previous knowledge.
Correct code is code that conforms to its specification. A good specification defines
invariants constraining the state,
preconditions and postconditions describing the effects of the operations.
Thread-safe code is code that remains correct when executed by multiple threads. Thus,
No sequence of operations can violate the specification.1
Invariants and conditions will hold during multithread execution without requiring additional synchronization by the client2.
The high level takeaway point is: thread-safe requires that the specification holds true during multithread execution. To actually code this, we have to do just one thing: regulate the access to mutable shared state3. And there are three ways to do it:
Prevent the access.
Make the state immutable.
Synchronize the access.
The first two are simple. The third one requires preventing the following thread-safety problems:
liveness
deadlock: two threads block permanently waiting for each other to release a needed resource.
livelock: a thread is busy working but it's unable to make any progress.
starvation: a thread is perpetually denied access to resources it needs in order to make progress.
safe publication: both the reference and the state of the published object must be made visible to other threads at the same time.
race conditions A race condition is a defect where the output is dependent on the timing of uncontrollable events. In other words, a race condition happens when getting the right answer relies on lucky timing. Any compound operation can suffer a race condition, example: “check-then-act”, “put-if-absent”. An example problem would be if (counter) counter--;, and one of several solutions would be #synchronize(self){ if (counter) counter--;}.
To solve these problems we use tools like #synchronize, volatile, memory barriers, atomic operations, specific locks, queues, and synchronizers (semaphores, barriers).
And going back to the question:
is the proper use of #synchronize able to solve any thread-safe
problem?
Technically yes, because any tool mentioned above can be emulated with #synchronize. But it would result in poor performance and increase the chance of liveness related problems. Instead, you need to use the appropriate tool for each situation. Example:
counter++; // wrong, compound operation (fetch,++,set)
#synchronize(self){ counter++; } // correct but slow, thread contention
OSAtomicIncrement32(&count); // correct and fast, lockless atomic hw op
In the case of the linked question you could indeed use #synchronize, or a GCD read-write lock, or create a collection with lock stripping, or whatever the situation calls for. The right answer depend on the usage pattern. Any way you do it, you should document in your class what thread-safe guarantees are you offering.
1 That is, see the object on an invalid state or violate the pre/post conditions.
2 For example, if thread A iterates a collection X, and thread B removes an element, execution crashes. This is non thread-safe because the client will have to synchronize on the intrinsic lock of X (synchronize(X)) to have exclusive access. However, if the iterator returns a copy of the collection, the collection becomes thread-safe.
3 Immutable shared state, or mutable non shared objects are always thread-safe.
Generally, #synchronized guarantees thread safety, but only when used correctly. It is also safe to acquire the lock recursively, albeit with limitations I detail in my answer here.
There are several common ways to use #synchronized wrong. These are the most common:
Using #synchronized to ensure atomic object creation.
- (NSObject *)foo {
#synchronized(_foo) {
if (!_foo) {
_foo = [[NSObject alloc] init];
}
return _foo;
}
}
Because _foo will be nil when the lock is first acquired, no locking will occur and multiple threads can potentially create their own _foo before the first completes.
Using #synchronized to lock on a new object each time.
- (void)foo {
#synchronized([[NSObject alloc] init]) {
[self bar];
}
}
I've seen this code quite a bit, as well as the C# equivalent lock(new object()) {..}. Since it attempts to lock on a new object each time, it will always be allowed into the critical section of code. This is not some kind of code magic. It does absolutely nothing to ensure thread safety.
Lastly, locking on self.
- (void)foo {
#synchronized(self) {
[self bar];
}
}
While not by itself a problem, if your code uses any external code or is itself a library, it can be an issue. While internally the object is known as self, it externally has a variable name. If the external code calls #synchronized(_yourObject) {...} and you call #synchronized(self) {...}, you may find yourself in deadlock. It is best to create an internal object to lock upon that is not exposed outside of your object. Adding _lockObject = [[NSObject alloc] init]; inside your init function is cheap, easy, and safe.
EDIT:
I still get asked questions about this post, so here is an example of why it is a bad idea to use #synchronized(self) in practice.
#interface Foo : NSObject
- (void)doSomething;
#end
#implementation Foo
- (void)doSomething {
sleep(1);
#synchronized(self) {
NSLog(#"Critical Section.");
}
}
// Elsewhere in your code
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
Foo *foo = [[Foo alloc] init];
NSObject *lock = [[NSObject alloc] init];
dispatch_async(queue, ^{
for (int i=0; i<100; i++) {
#synchronized(lock) {
[foo doSomething];
}
NSLog(#"Background pass %d complete.", i);
}
});
for (int i=0; i<100; i++) {
#synchronized(foo) {
#synchronized(lock) {
[foo doSomething];
}
}
NSLog(#"Foreground pass %d complete.", i);
}
It should be obvious to see why this happens. Locking on foo and lock are called in different orders on the foreground VS background threads. It's easy to say that this is bad practice, but if Foo is a library, the user is unlikely to know that the code contains a lock.
#synchronized alone doesn't make code thread safe but it is one of the tools used in writing thread safe code.
With multi-threaded programs, it's often the case of a complex structure that you want to be maintained in a consistent state and you want only one thread to have access at a time. The common pattern is to use a mutex to protect a critical section of code where the structure is accessed and/or modified.
#synchronized is thread safe mechanism. Piece of code written inside this function becomes the part of critical section, to which only one thread can execute at a time.
#synchronize applies the lock implicitly whereas NSLock applies it explicitly.
It only assures the thread safety, not guarantees that. What I mean is you hire an expert driver for you car, still it doesn't guarantees car wont meet an accident. However probability remains the slightest.
It's companion in GCD(grand central dispatch) is dispatch_once. dispatch_once does the same work as to #synchronized.
The #synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code.
side-effects of mutex locks:
deadlocks
starvation
Thread safety will depend on usage of #synchronized block.

Resources