Thread safety crash using NSDictionary? - ios

I have code accessing and setting an NSDictionary many times across multiple threads, like so:
- (BOOL)flagForItem:(NSNumber*)itemID
{
    if(itemID) {
        return [[_itemFlagDict objectForKey:itemID] boolValue];
    }
    return NO;
}
and:
- (void)setFlagForItem:(NSNumber*)itemID
{
    if(itemID) {
        NSMutableDictionary *copy = [_itemFlagDict mutableCopy];
        [copy setObject:[NSNumber numberWithBool:YES] forKey:itemID];
        _itemFlagDict = [NSDictionary dictionaryWithDictionary:copy];
    }
}
In the set method, I originally had an NSMutableDictionary. I changed it to the pattern you see now because, doh, NSMutableDictionary isn't thread-safe. My reasoning was to perform the mutation on a copy and then reassign _itemFlagDict to capture the update.
However, an EXC_BAD_ACCESS crash still occasionally occurs when accessing _itemFlagDict, leading me to believe that the dictionary is reassigned while objectForKey: is still executing on it.
One other approach I tried was to use @synchronized(_itemFlagDict) in both the accessor and the setter methods. While this fixed the issue, this code is performance-sensitive, and synchronizing the access/assignment caused too much performance degradation.
So my question is, are there other patterns/methods I can use to avoid this bad access while not compromising performance? If prioritization matters, the execution (and not necessarily iron-clad accuracy) of the accessor method is most important.
Note: I'm working with iOS 4 and above.

Have you tried read/write locks? You can have multiple threads in your get method at once and a single writer in the set method: https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/pthread.3.html

I have encountered the same problem, and my current solution is to use atomic operations to implement a thread-safe dictionary.
You can check it out here: https://github.com/bangquangvn/FastestThreadsafeDictionary-iOS
I think it's the fastest solution.

Related

atomic make the retainCount+1

I don't often use atomic properties, but I found something strange. My test file does not use ARC.
I use a property @property(atomic,retain) NSArray* test;
Then I run a test in the init method, like this:
1) NSArray* testArray = [NSArray arrayWithObject:@"1"];
2) self.test = testArray;
After executing 1)
[testArray retainCount] = 1
After executing 2)
[testArray retainCount] = 2
[self.test retainCount] = 3
[_test retainCount] = 3.
Then I change the property from atomic to nonatomic. After executing 1)
[testArray retainCount] = 1
After executing 2)
[testArray retainCount] = 2
[self.test retainCount] = 2
[_test retainCount] = 2.
So I don't know why. Can atomic add to the retainCount?
You shouldn't do Manual Reference Counting in this day and age. Seriously, there is no drawback to using ARC. The only reason I can think of is, if you need to maintain legacy code that can't be converted to ARC (due to resource allocation priorities, etc.).
Even back in the days of MRC, Apple strongly discouraged the direct manipulation of the retainCount property: that is just an implementation detail and relying on it will just make your app fragile.
You should instead design your app around object graphs (ownership relationships), whether with ARC or MRC.
I think the atomic getter puts the object into the current autorelease pool. This is needed to keep the object alive while you work with it (at that moment the object might be released on another thread).
Try wrapping
NSArray* testArray = [NSArray arrayWithObject:@"1"];
self.test = testArray;
in an autorelease pool, and check the retainCount afterwards (once the code has exited the autorelease pool).
I think you will get retainCount == 1.
The absolute value of the retainCount is completely useless. You can't infer anything from it about an object's lifespan, nor is it particularly useful for debugging.
See http://www.whentouseretaincount.com for more information.
As for your specific case, the retain count is changing behavior across atomic due to implementation details. If you switch between optimized and non-optimized builds, you'll probably see yet different results.

How do I properly dealloc and release objects being referred to in a while loop

The following code creates a memory leak. An asynchronous background process downloads images in tmp_pack_folder and another background thread is checking if the image count matches the total count expected, and then makes the images available to users once the download is complete.
The issue is that if the background process that is downloading images to the tmp_pack_folder fails for some reason, the following code becomes an infinite loop. This is a rare case, but when it happens there is a memory leak. The getAllFileNamesInFolder method actually calls contentsOfDirectoryAtPath:bundleRoot of NSFileManager, and it is called repeatedly. How do I properly deallocate memory in this case (apart from preventing the infinite loop to begin with)?
NSString *tmp_pack_folder = [packid stringByAppendingString:@"_tmp"];
if([fileMgr folderExists: tmp_pack_folder]){
    NSArray *packImages = [fileMgr getAllFileNamesInFolder:tmp_pack_folder];
    while(packImages.count != arrImages.count ){
        packImages = [fileMgr getAllFileNamesInFolder:tmp_pack_folder]; //get the contents of the folder again.
        if(cancel==YES){
            break;
        }
    }
}
You say that you will rework this to "prevent the infinite loop." You should take that a step further and eliminate the loop altogether. If you ever find yourself with code that loops, polling some status, there's invariably an alternate, more efficient design. Bottom line, your memory situation is not the real problem: It's merely a symptom of a broader design issue.
I'd advise you to move to an event-driven approach. So, rather than having a method that repeatedly performs the "am I done yet" logic, you should only check this status when triggered by the appropriate event (i.e. only when a download finishes/fails, and not before). This loop is probably causing your memory problem, so don't fix the memory problem, but rather eliminate the loop altogether.
In answer to your question, one possible source of the memory problem arises from autorelease objects. These are objects that are allocated in such a manner that they are not released immediately when you're done with them, but rather only when the autorelease pool is drained (which generally happens for you automatically when you yield back to the app's run loop). But if you have some large loop that you repeatedly call, you end up adding lots of objects to an autorelease pool that isn't drained in a timely manner.
In some special cases, if you truly needed some loop (and to be clear, that's not the case here; you neither need nor want a loop in this case), you could employ your own custom @autoreleasepool, through which you'd effectively control the frequency of the draining of the pool.
But, at the risk of belaboring the point, this is simply not one of those situations. Don't use your own autorelease pool. Get rid of the loop. Only trigger the "am I done yet" logic when a download finishes/fails, and your problem should go away.
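As a rough illustration of triggering the "am I done yet" check only when a download finishes or fails, here is a sketch in plain C. The download_group type, its functions, and the mark_ready callback are hypothetical names invented for this example; the real code would call the finish function from whatever completion hook the downloader provides.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void (*done_fn)(void *context);

/* Hypothetical bookkeeping for one pack of images. */
typedef struct {
    atomic_int remaining;   /* downloads that have not reported back yet */
    done_fn on_all_done;    /* invoked once, by whichever download reports last */
    void *context;
} download_group;

void download_group_init(download_group *g, int count,
                         done_fn on_all_done, void *context) {
    atomic_init(&g->remaining, count);
    g->on_all_done = on_all_done;
    g->context = context;
}

/* Call exactly once per download, whether it succeeded or failed. */
void download_group_finish_one(download_group *g) {
    /* atomic_fetch_sub returns the value before the subtraction,
       so exactly one caller observes 1 and fires the callback. */
    if (atomic_fetch_sub(&g->remaining, 1) == 1) {
        g->on_all_done(g->context);
    }
}

/* Example callback: counts how often the pack was reported ready. */
void mark_ready(void *context) {
    *(int *)context += 1;
}
```

Nothing here polls the file system, so no temporary objects pile up while waiting. A failed download still reports in, which is what keeps the group from waiting forever.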
It's too bad Objective-C doesn't give us JavaScript-like promises. The way I solve this problem is by giving my asynch task a caller's interface like this:
- (void)doAsynchThingWithParams:(id)params completion:(void (^)(id))completion;
The params parameterize whatever the task is, and the completion handler takes result of the task.
This lets me treat several concurrent tasks like a todo list, with a completion handler that gets called with all the results once they've arrived.
// array is an array of params for each task e.g. urls for making url requests
// completion is called when all are complete with an array of results
- (void)doManyThingsWithParams:(NSArray *)array completion:(void (^)(NSArray *))completion {
    NSMutableArray *todoList = [array mutableCopy];
    NSMutableArray *results = [NSMutableArray array];
    // results will always have N elements, one for each task
    // nulls can be replaced by either good results or NSErrors
    for (int i=0; i<array.count; ++i) results[i] = [NSNull null];
    for (id params in array) {
        [self doAsynchThingWithParams:params completion:^(id result) {
            if (result) {
                NSInteger index = [array indexOfObject:params];
                [results replaceObjectAtIndex:index withObject:result];
            }
            [todoList removeObject:params];
            if (!todoList.count) completion(results);
        }];
    }
}

Does @synchronized guarantee thread safety or not?

With reference to this answer, I am wondering: is this correct?
@synchronized does not make any code "thread-safe"
I tried to find documentation or a link to support this statement, with no success.
Any comments and/or answers will be appreciated.
I know that other tools are available for better thread safety.
@synchronized does make code thread safe if it is used properly.
For example:
Let's say I have a class that accesses a non-thread-safe database. I don't want to read and write to the database at the same time, as this will likely result in a crash.
So let's say I have two methods, storeData: and readData, on a singleton class called LocalStore.
- (void)storeData:(NSData *)data
{
    [self writeDataToDisk:data];
}
- (NSData *)readData
{
    return [self readDataFromDisk];
}
Now if I were to dispatch each of these methods onto their own thread like so:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [[LocalStore sharedStore] storeData:data];
});
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [[LocalStore sharedStore] readData];
});
Chances are we would get a crash. However, if we change our storeData and readData methods to use @synchronized:
- (void)storeData:(NSData *)data
{
    @synchronized(self) {
        [self writeDataToDisk:data];
    }
}
- (NSData *)readData
{
    @synchronized(self) {
        return [self readDataFromDisk];
    }
}
Now this code would be thread safe. It is important to note, however, that if I removed one of the @synchronized statements, the code would no longer be thread safe. The same is true if I were to synchronize on different objects instead of self.
@synchronized creates a mutex lock on the object you are synchronizing on. In other words, if any code wants to enter a @synchronized(self) { } block, it has to get in line behind all previous code running within that same block.
If we were to create different LocalStore objects, the @synchronized(self) would only lock down each object individually. Does that make sense?
Think of it like this. You have a whole bunch of people waiting in separate lines, and each line is numbered 1-10. You can choose which line you want each person to wait in (by synchronizing on a per-line basis), or if you don't use @synchronized you can jump straight to the front and skip all the lines. A person in line 1 doesn't have to wait for a person in line 2 to finish, but the person in line 1 does have to wait for everyone in front of them in their line to finish.
I think the essence of the question is:
is the proper use of @synchronized able to solve any thread-safety problem?
Technically yes, but in practice it's advisable to learn and use other tools.
I'll answer without assuming previous knowledge.
Correct code is code that conforms to its specification. A good specification defines invariants constraining the state, and preconditions and postconditions describing the effects of the operations.
Thread-safe code is code that remains correct when executed by multiple threads. Thus,
No sequence of operations can violate the specification. [1]
Invariants and conditions will hold during multithreaded execution without requiring additional synchronization by the client. [2]
The high-level takeaway is: thread safety requires that the specification holds true during multithreaded execution. To actually code this, we have to do just one thing: regulate the access to mutable shared state. [3] And there are three ways to do it:
Prevent the access.
Make the state immutable.
Synchronize the access.
The first two are simple. The third one requires preventing the following thread-safety problems:
liveness
deadlock: two threads block permanently waiting for each other to release a needed resource.
livelock: a thread is busy working but it's unable to make any progress.
starvation: a thread is perpetually denied access to resources it needs in order to make progress.
safe publication: both the reference and the state of the published object must be made visible to other threads at the same time.
race conditions: a race condition is a defect where the output depends on the timing of uncontrollable events. In other words, a race condition happens when getting the right answer relies on lucky timing. Any compound operation can suffer a race condition, for example “check-then-act” or “put-if-absent”. An example problem would be if (counter) counter--;, and one of several solutions would be @synchronized(self){ if (counter) counter--;}.
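The check-then-act case can be sketched in plain C with a pthread mutex playing the role of the lock. The guarded_counter type and its function names are assumptions made for this illustration, and the check is written as "greater than zero" rather than "non-zero".

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical counter whose check and update must happen as one step. */
typedef struct {
    pthread_mutex_t lock;
    int value;
} guarded_counter;

/* Returns 0 on success, like pthread_mutex_init itself. */
int guarded_counter_init(guarded_counter *c, int initial) {
    c->value = initial;
    return pthread_mutex_init(&c->lock, NULL);
}

/* The check and the decrement run under the same lock, so no other
   thread can change value between them. Returns true if it decremented. */
bool guarded_counter_take(guarded_counter *c) {
    bool took = false;
    pthread_mutex_lock(&c->lock);
    if (c->value > 0) {
        c->value--;
        took = true;
    }
    pthread_mutex_unlock(&c->lock);
    return took;
}
```

Without the lock, two threads could both pass the check while value is 1 and both decrement, leaving the counter at -1.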
To solve these problems we use tools like @synchronized, volatile, memory barriers, atomic operations, specific locks, queues, and synchronizers (semaphores, barriers).
And going back to the question:
is the proper use of @synchronized able to solve any thread-safety problem?
Technically yes, because any tool mentioned above can be emulated with @synchronized. But it would result in poor performance and increase the chance of liveness-related problems. Instead, you need to use the appropriate tool for each situation. Example:
counter++; // wrong, compound operation (fetch, ++, set)
@synchronized(self){ counter++; } // correct but slow, thread contention
OSAtomicIncrement32(&counter); // correct and fast, lockless atomic hw op
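For comparison, here is a sketch of the same lockless increment using C11 atomics from stdatomic.h, which to my knowledge is what newer Apple SDKs suggest in place of the OSAtomic functions. The worker function, count_with_threads, and MAX_WORKERS are hypothetical names used only in this example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 16  /* illustrative upper bound (assumption) */

static atomic_int counter;

static void *worker(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        atomic_fetch_add(&counter, 1);  /* one indivisible read-modify-write */
    }
    return NULL;
}

/* Runs nthreads workers and returns the final count, or -1 on error. */
int count_with_threads(int nthreads, int iterations) {
    pthread_t threads[MAX_WORKERS];
    if (nthreads < 0 || nthreads > MAX_WORKERS) return -1;
    atomic_store(&counter, 0);
    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, worker, &iterations) != 0) break;
    }
    /* Join every thread that started before reading the result. */
    for (int i = 0; i < started; i++) pthread_join(threads[i], NULL);
    return started == nthreads ? atomic_load(&counter) : -1;
}
```

Calling count_with_threads(4, 10000) should return exactly 40000, because no increment can be lost. A plain int with counter++ in the same loop may come up short.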
In the case of the linked question you could indeed use @synchronized, or a GCD read-write lock, or create a collection with lock striping, or whatever the situation calls for. The right answer depends on the usage pattern. Whichever way you do it, you should document in your class which thread-safety guarantees you are offering.
[1] That is, see the object in an invalid state or violate the pre/post conditions.
[2] For example, if thread A iterates a collection X, and thread B removes an element, execution crashes. This is not thread-safe because the client will have to synchronize on the intrinsic lock of X (@synchronized(X)) to have exclusive access. However, if the iterator returns a copy of the collection, the collection becomes thread-safe.
[3] Immutable shared state, or mutable non-shared objects, are always thread-safe.
Generally, @synchronized guarantees thread safety, but only when used correctly. It is also safe to acquire the lock recursively, albeit with limitations I detail in my answer here.
There are several common ways to use @synchronized wrong. These are the most common:
Using @synchronized to ensure atomic object creation.
- (NSObject *)foo {
    @synchronized(_foo) {
        if (!_foo) {
            _foo = [[NSObject alloc] init];
        }
        return _foo;
    }
}
Because _foo will be nil when the lock is first acquired, no locking will occur and multiple threads can potentially create their own _foo before the first completes.
Using @synchronized to lock on a new object each time.
- (void)foo {
    @synchronized([[NSObject alloc] init]) {
        [self bar];
    }
}
I've seen this code quite a bit, as well as the C# equivalent lock(new object()) {..}. Since it attempts to lock on a new object each time, it will always be allowed into the critical section of code. This is not some kind of code magic. It does absolutely nothing to ensure thread safety.
Lastly, locking on self.
- (void)foo {
    @synchronized(self) {
        [self bar];
    }
}
While not by itself a problem, if your code uses any external code or is itself a library, it can be an issue. While internally the object is known as self, it externally has a variable name. If the external code calls @synchronized(_yourObject) {...} and you call @synchronized(self) {...}, you may find yourself in deadlock. It is best to create an internal object to lock upon that is not exposed outside of your object. Adding _lockObject = [[NSObject alloc] init]; inside your init function is cheap, easy, and safe.
EDIT:
I still get asked questions about this post, so here is an example of why it is a bad idea to use @synchronized(self) in practice.
@interface Foo : NSObject
- (void)doSomething;
@end
@implementation Foo
- (void)doSomething {
    sleep(1);
    @synchronized(self) {
        NSLog(@"Critical Section.");
    }
}
@end
// Elsewhere in your code
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
Foo *foo = [[Foo alloc] init];
NSObject *lock = [[NSObject alloc] init];
dispatch_async(queue, ^{
    for (int i=0; i<100; i++) {
        @synchronized(lock) {
            [foo doSomething];
        }
        NSLog(@"Background pass %d complete.", i);
    }
});
for (int i=0; i<100; i++) {
    @synchronized(foo) {
        @synchronized(lock) {
            [foo doSomething];
        }
    }
    NSLog(@"Foreground pass %d complete.", i);
}
It should be obvious why this deadlocks: the locks on foo and lock are acquired in different orders on the foreground vs. background threads. It's easy to say that this is bad practice, but if Foo is a library, the user is unlikely to know that the code contains a lock.
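The ordering problem itself can be shown in plain C. The sketch below uses a different remedy from the private lock object recommended above: it always acquires two mutexes in one fixed order (by address), whatever order the caller names them in. The names lock_pair, unlock_pair, foo_lock, other_lock, and run_both are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Acquire two distinct mutexes in a fixed global order (lowest address
   first), regardless of the order in which the caller passes them. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t other_lock = PTHREAD_MUTEX_INITIALIZER;
static int passes;  /* only touched while both locks are held */

static void *background(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        lock_pair(&other_lock, &foo_lock);   /* names the locks in one order */
        passes++;
        unlock_pair(&other_lock, &foo_lock);
    }
    return NULL;
}

/* Runs n passes on a background thread and n on the caller's thread.
   Returns the total number of passes, or -1 if the thread did not start. */
int run_both(int n) {
    pthread_t thread;
    passes = 0;
    if (pthread_create(&thread, NULL, background, &n) != 0) return -1;
    for (int i = 0; i < n; i++) {
        lock_pair(&foo_lock, &other_lock);   /* names them in the opposite order */
        passes++;
        unlock_pair(&foo_lock, &other_lock);
    }
    pthread_join(thread, NULL);
    return passes;
}
```

This only works when every caller goes through lock_pair, which is exactly what a library cannot enforce on its users. That is why keeping the lock object private remains the simpler advice.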
@synchronized alone doesn't make code thread safe, but it is one of the tools used in writing thread-safe code.
With multi-threaded programs, it's often the case of a complex structure that you want to be maintained in a consistent state and you want only one thread to have access at a time. The common pattern is to use a mutex to protect a critical section of code where the structure is accessed and/or modified.
@synchronized is a thread-safety mechanism. Code written inside the block becomes part of a critical section, which only one thread can execute at a time.
@synchronized applies the lock implicitly, whereas NSLock applies it explicitly.
It only promotes thread safety; it does not guarantee it. By analogy, hiring an expert driver for your car doesn't guarantee the car will never have an accident, although the probability stays very low.
Its companion in GCD (Grand Central Dispatch) is dispatch_once, which covers one common use of @synchronized: running initialization code exactly once.
The @synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code.
side-effects of mutex locks:
deadlocks
starvation
Thread safety depends on how the @synchronized block is used.

Thread safety of NSMutableDictionary access and destruction

I have an application that downloads information from a web service and caches it in memory. Specifically, my singleton cache class contains an instance variable NSMutableDictionary *memoryDirectory which contains all of the cached data. The data in this cache can be redownloaded easily, so when I receive a UIApplicationDidReceiveMemoryWarningNotification I call a method to simply invoke
- (void) dumpCache:(NSNotification *)notification
{
    memoryDirectory = nil;
}
I’m a little worried about the thread safety here. (I’ll admit I don’t know much about threads in general, much less in Cocoa’s implementation.) The cache is a mutable dictionary whose values are mutable dictionaries, so there are two levels of keys to access data. When I write to the cache I do something like this:
- (void) addDataToCache:(NSData *)data
                 forKey:(NSString *)key
                 subkey:(NSString *)subkey
{
    if (!memoryDirectory)
        memoryDirectory = [[NSMutableDictionary alloc] init];
    NSMutableDictionary *methodDictionary = [memoryDirectory objectForKey:key];
    if (!methodDictionary) {
        [memoryDirectory setObject:[NSMutableDictionary dictionary] forKey:key];
        methodDictionary = [memoryDirectory objectForKey:key];
    }
    [methodDictionary setObject:data forKey:subkey];
}
I’m worried that sometime in the middle of the process, dumpCache: is going to nil out the dictionary and I’m going to be left doing a bunch of setObject:forKey:s that don’t do anything. This isn’t fatal but you can imagine the problems that might come up if this happens while I’m reading the cache.
Is it sufficient to wrap all of my cache reads and writes in some kind of @synchronized block? If so, what should it look like? (And should my dumpCache: be similarly wrapped?) If not, how should I ensure that what I’m doing is safe?
Instead of using an NSMutableDictionary, consider using NSCache, which is thread safe. See this answer for example usage. Good luck!

Modifying mutable object in completion handler

I have a question about thread safety of the following code example from Apple (from GameKit programming guide)
This is to load achievements from game center and save it locally:
Step 1) Add a mutable dictionary property to your class that reports achievements. This dictionary stores the collection of achievement objects.
@property(nonatomic, retain) NSMutableDictionary *achievementsDictionary;
Step 2) Initialize the achievements dictionary.
achievementsDictionary = [[NSMutableDictionary alloc] init];
Step 3) Modify your code that loads achievement data to add the achievement objects to the dictionary.
[GKAchievement loadAchievementsWithCompletionHandler:^(NSArray *achievements, NSError *error)
{
    if (error == nil)
    {
        for (GKAchievement* achievement in achievements)
            [achievementsDictionary setObject: achievement forKey: achievement.identifier];
    }
}];
My question is as follows: the achievementsDictionary object is being modified in the completion handler, without locks of any sort. Is this allowed because completion handlers are a block of work that will be guaranteed by iOS to be executed as a unit on the main thread? And never run into thread safety issues?
In another Apple sample code (GKTapper), this part is handled differently:
@property (retain) NSMutableDictionary* earnedAchievementCache; // note this is atomic
Then in the handler:
[GKAchievement loadAchievementsWithCompletionHandler: ^(NSArray *scores, NSError *error)
{
    if(error == NULL)
    {
        NSMutableDictionary* tempCache= [NSMutableDictionary dictionaryWithCapacity: [scores count]];
        for (GKAchievement* score in scores)
        {
            [tempCache setObject: score forKey: score.identifier];
        }
        self.earnedAchievementCache= tempCache;
    }
}];
So why the different style, and is one way more correct than the other?
Is this allowed because completion handlers are a block of work that will be guaranteed by iOS to be executed as a unit on the main thread? And never run into thread safety issues?
This is definitely not the case here. The documentation for +loadAchievementsWithCompletionHandler: explicitly warns that the completion handler might be called on a thread other than the one you started the load from.
Apple's "Threading Programming Guide" classifies NSMutableDictionary among thread-unsafe classes, but qualifies this with, "In most cases, you can use these classes from any thread as long as you use them from only one thread at a time."
So, if both apps are designed such that nothing will be working with the achievement cache till the worker task has finished updating it, then no synchronization would be necessary. This is the only way in which I can see the first example as being safe, and it's a tenuous safety.
The latter example looks like it's relying on the atomic property support to make the switcheroo between the old cache and the new cache. This should be safe, provided all access to the property is via its accessors rather than direct ivar access. This is because the accessors are synchronized with respect to each other, so you do not risk seeing a half-set value. Also, the getter retains and autoreleases the returned value, so that code with the old version will be able to finish working with it without crashing because it was released in the middle of its work. A nonatomic getter simply returns the object directly, which means that it could be deallocated out from under your code if a new value were set for that property by another thread. Direct ivar access can run into the same problem.
I would say the latter example is both correct and elegant, though perhaps a bit over-subtle without a comment explaining how crucial the atomicity of the property is.
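To make the "getter retains, setter swaps" reasoning concrete, here is a conceptual model in plain C. It is not how the Objective-C runtime implements atomic properties; the snapshot and cache_slot types and all function names are assumptions for illustration, with a manual reference count standing in for retain/release.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted snapshot standing in for the cache dictionary. */
typedef struct {
    atomic_int refcount;
    int entry_count;   /* placeholder payload */
} snapshot;

snapshot *snapshot_create(int entry_count) {
    snapshot *s = malloc(sizeof *s);
    if (!s) return NULL;
    atomic_init(&s->refcount, 1);   /* the creator owns one reference */
    s->entry_count = entry_count;
    return s;
}

void snapshot_release(snapshot *s) {
    /* The caller that drops the last reference frees the snapshot. */
    if (s && atomic_fetch_sub(&s->refcount, 1) == 1) free(s);
}

typedef struct {
    pthread_mutex_t lock;
    snapshot *current;
} cache_slot;

int cache_slot_init(cache_slot *slot) {
    slot->current = NULL;
    return pthread_mutex_init(&slot->lock, NULL);
}

/* Getter: take a reference while holding the lock, so a concurrent setter
   cannot free the snapshot before the caller owns its own reference. */
snapshot *cache_slot_get(cache_slot *slot) {
    pthread_mutex_lock(&slot->lock);
    snapshot *s = slot->current;
    if (s) atomic_fetch_add(&s->refcount, 1);
    pthread_mutex_unlock(&slot->lock);
    return s;   /* caller must snapshot_release() it when done */
}

/* Setter: takes over the caller's reference to fresh, swaps under the lock,
   and drops the slot's reference to the old snapshot afterwards. */
void cache_slot_set(cache_slot *slot, snapshot *fresh) {
    pthread_mutex_lock(&slot->lock);
    snapshot *old = slot->current;
    slot->current = fresh;
    pthread_mutex_unlock(&slot->lock);
    snapshot_release(old);
}
```

A reader that fetched the old snapshot keeps it alive until it calls snapshot_release, even if the slot has been replaced in the meantime. Reading slot->current directly, without the lock and the extra reference, is the analogue of direct ivar access and can hand back memory that is about to be freed.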
