Is copying a collection before iteration enough to prevent synchronization problems? - ios

I have a sessions property, a mutable set. I need to iterate over the collection, but at the same time I could change the collection in another method:
- (Session*) sessionWithID: (NSString*) sessionID
{
    for (Session *candidate in _sessions) {
        /* do something */
    }
    return nil;
}
- (void) doSomethingElse
{
    [_sessions removeObject:…];
}
This isn’t thread-safe. A bullet-proof version would use @synchronized or a dispatch queue to serialize access to _sessions. But how reasonable is it to simply copy the set before iterating over it?
- (Session*) sessionWithID: (NSString*) sessionID
{
    for (Session *candidate in [_sessions copy]) {
        /* do something */
    }
    return nil;
}
I don’t care about the performance difference much.

But how reasonable is it to simply copy the set before iterating over it?
As presented, it is not guaranteed to be thread safe. You would need to guarantee that _sessions is not mutated during -copy. Then iterating over an immutable copy is safe, and mutation of _sessions may occur on a secondary thread or in your implementation.
In many cases with Cocoa collections, you will find it is preferable to use immutable ivars and copy on set by declaring the property as copy of type NSSet. This way, you copy on write/set, and then avoid the copy on read. This has the potential to reduce copies, depending on how your program actually executes. Generally, this alone is not enough, and you will need some higher level of synchronization.
Also remember that the Session objects in the set may not be thread safe. Even once your collection accesses are properly guarded, you may need to protect access to those objects.

Your code does not look thread-safe to me because the collection might be mutated from another thread while it is copied.
You would have to protect [_sessions copy] and [_sessions removeObject:…] from executing simultaneously.
After creating the copy, you can iterate over it without a lock (assuming that the collection elements themselves are not modified from another thread).
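As a minimal sketch of what both answers describe, the copy can be taken under the same lock that guards the mutations, and the immutable snapshot enumerated afterwards without the lock. The Session and SessionStore classes, the sessionID property, and the choice of NSLock are assumptions made for illustration (ARC is assumed as well); the question does not show them.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the question's Session class.
@interface Session : NSObject
@property (nonatomic, copy, readonly) NSString *sessionID;
- (instancetype)initWithID:(NSString *)sessionID;
@end

@implementation Session
- (instancetype)initWithID:(NSString *)sessionID
{
    if ((self = [super init])) {
        _sessionID = [sessionID copy];
    }
    return self;
}
@end

// Hypothetical owner of the mutable set.
@interface SessionStore : NSObject
- (void)addSession:(Session *)session;
- (void)removeSession:(Session *)session;
- (Session *)sessionWithID:(NSString *)sessionID;
@end

@implementation SessionStore {
    NSMutableSet *_sessions;
    NSLock *_lock;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _sessions = [NSMutableSet set];
        _lock = [[NSLock alloc] init];
    }
    return self;
}

- (void)addSession:(Session *)session
{
    [_lock lock];
    [_sessions addObject:session];
    [_lock unlock];
}

- (void)removeSession:(Session *)session
{
    [_lock lock];
    [_sessions removeObject:session];
    [_lock unlock];
}

- (Session *)sessionWithID:(NSString *)sessionID
{
    // Take the snapshot while holding the lock so -copy cannot overlap a mutation.
    [_lock lock];
    NSSet *snapshot = [_sessions copy];
    [_lock unlock];

    // The snapshot is immutable, so it can be enumerated without the lock.
    for (Session *candidate in snapshot) {
        if ([candidate.sessionID isEqualToString:sessionID]) {
            return candidate;
        }
    }
    return nil;
}
@end
```

The lock is held only for the duration of the copy, not for the whole loop. The Session objects themselves are still shared between threads, so this protects the set but not the state of its elements.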

In one of my projects I have a background simulation that a GLView is drawn from. In order to do the drawing in a background thread I need to copy the simulation's current frame data, then perform the drawing based on that copy so that the simulation can continue in its own thread and not distort the drawing data.
I see copying information that will be used asynchronously as perfectly valid, especially on devices that have multiple cores. @synchronized makes the other threads wait (if they are accessing the same information) and can thereby cause more of a performance loss than the copy does.

Related

Atomic NSMutableArray being processed by two different threads

I was asked the following question during a technical interview, and it confused me:
If there is an atomic NSMutableArray that is being modified by two different threads, what are the risks in that scenario? Would it cause a crash? And how can the risks be avoided?
Can anyone tell me why there would be any risks? atomic is thread safe, isn't it?
Thanks
The atomic property attribute does not make the object a property points to thread safe. It only governs the accessors the compiler synthesizes: an atomic getter or setter always reads or writes a whole value, even if another thread is calling the accessors at the same time. If you provide your own getter and setter, the compiler does not generate an ivar for you, and any atomicity is up to your implementation.
Atomic property mutations are generally thread safe, but that's largely a side effect of modern CPUs. For example, setting an array property with a new object reference is generally thread safe. In other words, if two threads are setting a reference property at the same time exactly one will succeed; you won't end up with a weird half-reference that points off into space.
However, simply because the reference to an object is thread safe it does not make the object it refers to thread safe.
As a rule, any mutable object must use semaphores or some similar technique to safely mutate its state from multiple threads (or arrange that all access be performed from the same thread).
By far the simplest is to use semaphores. Surround or wrap any code that modifies or accesses the object with code that holds a semaphore until the operation is finished:
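#import <Foundation/Foundation.h>

@interface SafeCollection : NSObject
- (void)addToCollection:(id)obj;
- (id)objectInCollectionAtIndex:(NSUInteger)index;
@end

@implementation SafeCollection
{
    NSLock* collectionLock;
    NSMutableArray* collection;
}
- (instancetype)init
{
    if ((self = [super init])) {
        // Without this, collectionLock is nil and -lock/-unlock silently do nothing.
        collectionLock = [[NSLock alloc] init];
        collection = [NSMutableArray array];
    }
    return self;
}
- (void)addToCollection:(id)obj
{
    [collectionLock lock];
    [collection addObject:obj];
    [collectionLock unlock];
}
- (id)objectInCollectionAtIndex:(NSUInteger)index
{
    [collectionLock lock];
    id obj = collection[index];
    [collectionLock unlock];
    return obj;
}
@end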
Thread safety is an expansive and complex topic, but the basics for two threads attempting to manipulate a mutable resource are pretty straightforward.

How to make a custom class thread-safe in Objective-C

I've a TableViewController that uses the item at index N for the table view cell at row N. Since array index N may be accessed from different threads, I created a ThreadSafeMutableArray class that does the reads inside a dispatch_sync and writes under a dispatch_barrier_async.
Suppose I get the object at index N, say using Song *currSong = self.entries[N];, and then make changes to the properties of this object. Am I correct in understanding that I need to make these changes in a thread-safe way (for example, the table view may ask for the object at cell N at the same time as that object is being updated because its image has just arrived from the network)? If yes, what is the simplest way to make my custom class thread-safe?
For example, in the ThreadSafeMutableArray case I was able to achieve it by overriding the following methods and using dispatch_sync and dispatch_barrier_async within the new implementations.
-(NSUInteger) count;
-(id) objectAtIndex:(NSUInteger)index;
-(void) insertObject:(id)anObject atIndex:(NSUInteger)index;
-(void) removeObjectAtIndex:(NSUInteger)index;
-(void) addObject:(id)anObject;
-(void) removeLastObject;
-(void) replaceObjectAtIndex:(NSUInteger)index withObject:(id)anObject;
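The reader/writer pattern the question describes might be sketched as follows. This is not the asker's actual class: it wraps an NSMutableArray instead of subclassing it, covers only four of the listed methods, and the queue label is made up. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Illustrative wrapper; the real ThreadSafeMutableArray is not shown in the question.
@interface ThreadSafeMutableArray : NSObject
- (NSUInteger)count;
- (id)objectAtIndex:(NSUInteger)index;
- (void)addObject:(id)anObject;
- (void)removeObjectAtIndex:(NSUInteger)index;
@end

@implementation ThreadSafeMutableArray {
    NSMutableArray *_storage;
    dispatch_queue_t _queue;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _storage = [NSMutableArray array];
        // Barriers need a private concurrent queue; on a global queue
        // dispatch_barrier_async behaves like a plain dispatch_async.
        _queue = dispatch_queue_create("example.ThreadSafeMutableArray",
                                       DISPATCH_QUEUE_CONCURRENT);
    }
    return self;
}

// Reads may run concurrently with each other.
- (NSUInteger)count
{
    __block NSUInteger result = 0;
    dispatch_sync(_queue, ^{ result = self->_storage.count; });
    return result;
}

- (id)objectAtIndex:(NSUInteger)index
{
    __block id result = nil;
    dispatch_sync(_queue, ^{ result = self->_storage[index]; });
    return result;
}

// Writes use a barrier, so each one runs alone on the queue.
- (void)addObject:(id)anObject
{
    dispatch_barrier_async(_queue, ^{ [self->_storage addObject:anObject]; });
}

- (void)removeObjectAtIndex:(NSUInteger)index
{
    dispatch_barrier_async(_queue, ^{ [self->_storage removeObjectAtIndex:index]; });
}
@end
```

Each call is safe on its own, but a sequence such as checking count and then calling objectAtIndex: is not atomic: another thread can remove an element between the two calls. The wrapper also does nothing for the objects it stores, which is the point of the question.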
You need to determine what "thread safe" means in the context of your custom class/application. You might just want data integrity, meaning that no thread sees an invalid or partially stored value (think of atomic read/write operations); or you might require model integrity, where the interrelationships of multiple items are always correct, as in your mutable array; or something between these, such as incrementing a counter, which is not as involved as keeping the graph of objects representing a mutable structure consistent, but more involved than a simple atomic read or write. Thread safety is a big topic!
Once you know what your custom object requires you can select from atomic properties for simple read/write integrity, locks for more complex combinations, combinations of GCD sync, async, barrier, sequential and concurrent queues etc.
In short there is no single simple answer. Study the various options, consider your requirements, and pick and choose. You are already using GCD to achieve thread safety, that is good! If you come up with a design and have issues with it you can always ask SO.
You might find this article interesting on the benefits, or otherwise, of atomic properties. The writer is probably being a bit harsh on atomic to make a point, but it is certainly worth a read.
HTH
The easiest way to achieve this is to create a single access method in your TableViewController and use the @synchronized directive to protect access.
- (void)updateObjectAt:(NSUInteger)index {
    @synchronized(itemArray) {
        // Everything between the braces is protected by the @synchronized directive.
        [itemArray[index] update];
    }
}
The @synchronized directive takes a lock associated with the array object, so only one thread at a time can run a block synchronized on that array. Any other method that accesses the array must wrap that access in @synchronized on the same array as well; code that skips the lock is not protected.

-allKeys on background thread results in error: __NSDictionaryM was mutated while being enumerated

I've come across an interesting issue using mutable dictionaries on background threads.
Currently, I am downloading data in chunks on one thread, adding it to a data set, and processing it on another background thread. The overall design works for the most part aside from one issue: On occasion, a function call to an inner dictionary within the main data set causes the following crash:
*** Collection <__NSDictionaryM: 0x13000a190> was mutated while being enumerated.
I know this is a fairly common crash to have, but the strange part is that it's not crashing in a loop on this collection. Instead, the exception breakpoint in Xcode is stopping on the following line:
NSArray *tempKeys = [temp allKeys];
This leads me to believe that one thread is adding items to this collection while the NSMutableDictionary's internal function call to -allKeys is enumerating over the keys in order to return the array on another thread.
My question is: Is this what's happening? If so, what would be the best way to avoid this?
Here's the gist of what I'm doing:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
    for (NSString *key in [[queue allKeys] reverseObjectEnumerator]) { //To prevent crashes
        NEXActivityMap *temp = queue[key];
        NSArray *tempKeys = [temp allKeys]; //<= CRASHES HERE
        if (tempKeys.count > 0) {
            //Do other stuff
        }
    }
});
You can use @synchronized, and it will work. But this mixes up two different ideas:
Threads have been around for many years. A new thread opens a new control flow. Code in different threads potentially runs concurrently, causing conflicts like the one you hit. To prevent these conflicts you have to use locks, as @synchronized does.
GCD is the more modern concept. GCD runs "on top of threads": it uses threads, but this is transparent to you and you do not have to care about it. Code running in different queues potentially runs concurrently, causing conflicts. To prevent these conflicts you have to use one queue for each shared resource.
You are already using GCD, which is a good idea:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
The same code with threads would look like this:
[[NSThread mainThread] performSelector:…];
So, using GCD, you should use GCD to prevent the conflicts. What you are doing is to use GCD wrongly and then "repair" that with locks.
Simply put all accesses to the shared resource (in your case the mutable dictionary referred to by temp) onto one serial queue.
Create a queue for these accesses at the beginning. This is done once.
You could use an existing queue, but it has to be a serial one, and the global queue in your code is concurrent. Sharing one serial queue for everything can also lead to long queues of waiting tasks (in your example, blocks). Tasks in a serial queue are executed one after another, even if CPU cores are idle, so it is not a good idea to put too many tasks into one queue. Create a queue for each shared resource or "subsystem":
dispatch_queue_t tempQueue;
tempQueue = dispatch_queue_create("tempQueue", NULL);
When code wants to access the mutable dictionary, put it in a queue:
It looks like this:
dispatch_sync(tempQueue, ^{ // or dispatch_async, if possible
    [temp setObject:… forKey:…]; // Or whatever you want to do.
});
You have to put all code that accesses the shared resource on the queue, just as you have to put all code that accesses the shared resource inside locks when using threads.
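To make the idea concrete, here is a small sketch of a dictionary whose every access goes through one private serial queue. The ActivityMap class, its method names, and the queue label are hypothetical; the question's NEXActivityMap is not shown, so this is only an illustration of the technique under ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper; every read and write is funneled through one serial queue.
@interface ActivityMap : NSObject
- (void)setObject:(id)object forKey:(NSString *)key;
- (NSArray *)allKeys;
@end

@implementation ActivityMap {
    NSMutableDictionary *_entries;
    dispatch_queue_t _queue;
}

- (instancetype)init
{
    if ((self = [super init])) {
        _entries = [NSMutableDictionary dictionary];
        _queue = dispatch_queue_create("example.ActivityMap", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)setObject:(id)object forKey:(NSString *)key
{
    // Copy the key now, in case the caller passed a mutable string.
    NSString *keyCopy = [key copy];
    // Writers do not need a result, so they can return immediately.
    dispatch_async(_queue, ^{ self->_entries[keyCopy] = object; });
}

- (NSArray *)allKeys
{
    // Readers wait for their block, which runs after all earlier writes.
    __block NSArray *keys = nil;
    dispatch_sync(_queue, ^{ keys = [self->_entries allKeys]; });
    return keys;
}
@end
```

Because the queue is serial, -allKeys can never overlap a mutation. One caveat of this design: calling -allKeys from a block that is already running on the same queue would deadlock, since dispatch_sync waits for the queue it is blocking.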
From Apple documentation "Thread safety summary":
Mutable objects are generally not thread-safe. To use mutable objects
in a threaded application, the application must synchronize access to
them using locks. (For more information, see Atomic Operations). In
general, the collection classes (for example, NSMutableArray,
NSMutableDictionary) are not thread-safe when mutations are concerned.
That is, if one or more threads are changing the same array, problems
can occur. You must lock around spots where reads and writes occur to
assure thread safety.
In your case, the following scenario happens. On one thread, you add elements to the dictionary. On another thread, you call the allKeys method. While this method is copying all the keys into an array, the other thread adds a new key. This causes the exception.
To avoid that, you have several options.
Because you are using dispatch queues, the preferred way is to put all code that accesses the same mutable dictionary instance onto a private serial dispatch queue.
The second option is to pass an immutable copy of the dictionary to the other thread. In this case, no matter what happens to the original dictionary on the first thread, the copied data will stay consistent. Note that you will probably need a deep copy, because you use a hierarchy of dictionaries and arrays.
Alternatively, you can wrap every point where you access the collections in locks. Using @synchronized also implicitly creates a recursive lock for you.
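The second option, handing an immutable copy to the other thread, could look roughly like this. The sketch assumes a dictionary whose values are themselves mutable dictionaries, as in the question, and uses -initWithDictionary:copyItems:, which copies one level deep only. Both function names are made up for the example.

```objc
#import <Foundation/Foundation.h>

// Returns an immutable snapshot of a dictionary of dictionaries.
// Must be called where no other thread can mutate `source` during the copy,
// for example on the thread or queue that owns it.
static NSDictionary *SnapshotOfNestedDictionary(NSDictionary *source)
{
    // copyItems:YES sends -copyWithZone: to every value, so each inner
    // NSMutableDictionary becomes an immutable NSDictionary.
    // Values must conform to NSCopying; deeper levels are not copied.
    return [[NSDictionary alloc] initWithDictionary:source copyItems:YES];
}

// Example of background-safe processing: it only touches the snapshot.
static NSUInteger CountNonEmptyMaps(NSDictionary *snapshot)
{
    NSUInteger count = 0;
    for (NSString *key in snapshot) {
        NSDictionary *inner = snapshot[key];
        if (inner.count > 0) {
            count++;
        }
    }
    return count;
}
```

The snapshot can then be captured by a block dispatched to a background queue, while the original dictionaries continue to change on the thread that owns them.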
How about wrapping both the place where you get the keys AND the place where you set the keys with @synchronized?
Example:
- (void)myMethod:(id)anObj
{
    @synchronized(anObj)
    {
        // Everything between the braces is protected by the @synchronized directive.
    }
}

How do I stop an NSSet from being mutated while it's being enumerated elsewhere?

I have an NSMutableSet of objects which I mutate at several points on the main thread. There are times though when this set will be enumerated on a background thread. Short of using a bunch of booleans to manage state, is there a good way to make any mutations wait, should the set currently be enumerating?
Thanks in advance.
NSMutableSet is not thread safe. You cannot hack it to be thread safe.
If you want a copy that won't mutate for inspection in the background then take a copy on the main queue and pass that back. It'll be a lot faster than any other approach.
Otherwise you'll have to wrap all accesses to it in a mutex. Using @synchronized on the set itself would be the most straightforward way.

@synchronized block versus GCD dispatch_async()

Essentially, I have a set of data in an NSDictionary, but for convenience I'm setting up some NSArrays with the data sorted and filtered in a few different ways. The data will be coming in via different threads (blocks), and I want to make sure there is only one block at a time modifying my data store.
I went through the trouble of setting up a dispatch queue this afternoon, and then randomly stumbled onto a post about @synchronized that made it seem like pretty much exactly what I want to be doing.
So what I have right now is...
// a property on my object
@property (assign) dispatch_queue_t matchSortingQueue;
// in my object init
_matchSortingQueue = dispatch_queue_create("com.asdf.matchSortingQueue", NULL);
// then later...
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
    dispatch_async(_matchSortingQueue, ^{
        // do stuff...
    });
}
And my question is, could I just replace all of this with the following?
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
    @synchronized (self) {
        // do stuff...
    }
}
...And what's the difference between the two anyway? What should I be considering?
Although the functional difference might not matter much to you, it's what you'd expect: if you use @synchronized then the thread you're on is blocked until it can get exclusive execution. If you dispatch to a serial dispatch queue asynchronously then the calling thread can get on with other things and whatever it is you're actually doing will always occur on the same, known queue.
So they're equivalent for ensuring that a third resource is used from only one queue at a time.
Dispatching could be a better idea if, say, you had a resource that is accessed by the user interface from the main queue and you wanted to mutate it. Then your user interface code doesn't need to use @synchronized explicitly, hiding the complexity of your threading scheme within the object quite naturally. Dispatching will also be a better idea if you've got a central actor that can trigger several of these changes on other different actors; that'll allow them to operate concurrently.
Synchronising is more compact and a lot easier to step debug. If what you're doing tends to be two or three lines and you'd need to dispatch it synchronously anyway then it feels like going to the effort of creating a queue isn't worth it — especially when you consider the implicit costs of creating a block and moving it over onto the heap.
In the second case you would block the calling thread until "do stuff" was done. Using queues and dispatch_async you will not block the calling thread. This would be particularly important if you call sortArrayIntoLocalStore from the UI thread.
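The two approaches compared in these answers can be put side by side in a short sketch. LockedStore and QueuedStore are hypothetical classes invented for this comparison, each using only one mechanism, since guarding the same data with both a lock and a queue would not make the two exclude each other. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Variant 1: @synchronized. The caller blocks until the work is done.
@interface LockedStore : NSObject
- (void)addMatches:(NSArray *)matches;
- (NSArray *)matches;
@end

@implementation LockedStore {
    NSMutableArray *_matches;
}
- (instancetype)init
{
    if ((self = [super init])) {
        _matches = [NSMutableArray array];
    }
    return self;
}
- (void)addMatches:(NSArray *)matches
{
    @synchronized (self) {
        [_matches addObjectsFromArray:matches];
    }
}
- (NSArray *)matches
{
    @synchronized (self) {
        return [_matches copy];
    }
}
@end

// Variant 2: private serial queue. Writers return immediately.
@interface QueuedStore : NSObject
- (void)addMatches:(NSArray *)matches;
- (NSArray *)matches;
@end

@implementation QueuedStore {
    NSMutableArray *_matches;
    dispatch_queue_t _queue;
}
- (instancetype)init
{
    if ((self = [super init])) {
        _matches = [NSMutableArray array];
        _queue = dispatch_queue_create("example.QueuedStore", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}
- (void)addMatches:(NSArray *)matches
{
    // The calling thread (possibly the UI thread) does not wait for this block.
    dispatch_async(_queue, ^{ [self->_matches addObjectsFromArray:matches]; });
}
- (NSArray *)matches
{
    // Reading waits until all previously queued writes have run.
    __block NSArray *result = nil;
    dispatch_sync(_queue, ^{ result = [self->_matches copy]; });
    return result;
}
@end
```

Both variants serialize access to the array. The practical difference is who waits: with the lock, every caller of addMatches: waits its turn, while with the queue only readers that need a result wait.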
