@synchronized block versus GCD dispatch_async() - ios

Essentially, I have a set of data in an NSDictionary, but for convenience I'm setting up some NSArrays with the data sorted and filtered in a few different ways. The data will be coming in via different threads (blocks), and I want to make sure there is only one block at a time modifying my data store.
I went to the trouble of setting up a dispatch queue this afternoon, and then randomly stumbled onto a post about @synchronized that made it seem like pretty much exactly what I want to be doing.
So what I have right now is...
// a property on my object
@property (assign) dispatch_queue_t sortingQueue;
// in my object init
_sortingQueue = dispatch_queue_create("com.asdf.matchSortingQueue", NULL);
// then later...
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
dispatch_async(_sortingQueue, ^{
// do stuff...
});
}
And my question is, could I just replace all of this with the following?
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
@synchronized (self) {
// do stuff...
}
}
...And what's the difference between the two anyway? What should I be considering?

Although the functional difference might not matter much to you, it's what you'd expect: if you use @synchronized then the thread you're on is blocked until it can get exclusive execution. If you dispatch to a serial dispatch queue asynchronously then the calling thread can get on with other things and whatever it is you're actually doing will always occur on the same, known queue.
So they're equivalent for ensuring that a third resource is used from only one queue at a time.
Dispatching could be a better idea if, say, you had a resource that is accessed by the user interface from the main queue and you wanted to mutate it. Then your user interface code doesn't need to use @synchronized explicitly, hiding the complexity of your threading scheme within the object quite naturally. Dispatching will also be a better idea if you've got a central actor that can trigger several of these changes on other, different actors; that'll allow them to operate concurrently.
Synchronising is more compact and a lot easier to step debug. If what you're doing tends to be two or three lines and you'd need to dispatch it synchronously anyway then it feels like going to the effort of creating a queue isn't worth it — especially when you consider the implicit costs of creating a block and moving it over onto the heap.

In the second case you would block the calling thread until "do stuff" was done. Using queues and dispatch_async you will not block the calling thread. This would be particularly important if you call sortArrayIntoLocalStore from the UI thread.
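To make the contrast concrete, here is a minimal sketch of both approaches, assuming ARC. The class names LockedStore and QueuedStore, their methods, and the queue label are hypothetical and are not taken from the question; each class guards its own array with exactly one of the two mechanisms.

```objc
#import <Foundation/Foundation.h>

// Hypothetical classes for illustration only; assumes ARC.

// Variant 1: the caller blocks inside @synchronized until the work is done.
@interface LockedStore : NSObject
- (void)addItems:(NSArray *)items;
- (NSUInteger)count;
@end

@implementation LockedStore {
    NSMutableArray *_items;
}
- (instancetype)init {
    if ((self = [super init])) {
        _items = [NSMutableArray array];
    }
    return self;
}
- (void)addItems:(NSArray *)items {
    @synchronized (self) {
        [_items addObjectsFromArray:items];
    }
}
- (NSUInteger)count {
    @synchronized (self) {
        return _items.count;
    }
}
@end

// Variant 2: writes return immediately and run later on a private serial queue.
@interface QueuedStore : NSObject
- (void)addItems:(NSArray *)items;
- (NSUInteger)count;
@end

@implementation QueuedStore {
    NSMutableArray *_items;
    dispatch_queue_t _queue;
}
- (instancetype)init {
    if ((self = [super init])) {
        _items = [NSMutableArray array];
        _queue = dispatch_queue_create("com.example.QueuedStore", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}
- (void)addItems:(NSArray *)items {
    NSArray *snapshot = [items copy];   // the caller may mutate its array afterwards
    dispatch_async(_queue, ^{
        [self->_items addObjectsFromArray:snapshot];
    });
}
- (NSUInteger)count {
    // A synchronous read waits for every write queued before it.
    __block NSUInteger result = 0;
    dispatch_sync(_queue, ^{
        result = self->_items.count;
    });
    return result;
}
@end
```

Calling addItems: on a LockedStore from the UI thread stalls the UI while another thread holds the lock; the same call on a QueuedStore returns at once, and only count waits.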

Related

-allKeys on background thread results in error: __NSDictionaryM was mutated while being enumerated

I've come across an interesting issue using mutable dictionaries on background threads.
Currently, I am downloading data in chunks on one thread, adding it to a data set, and processing it on another background thread. The overall design works for the most part aside from one issue: On occasion, a function call to an inner dictionary within the main data set causes the following crash:
*** Collection <__NSDictionaryM: 0x13000a190> was mutated while being enumerated.
I know this is a fairly common crash to have, but the strange part is that it's not crashing in a loop on this collection. Instead, the exception breakpoint in Xcode is stopping on the following line:
NSArray *tempKeys = [temp allKeys];
This leads me to believe that one thread is adding items to this collection while the NSMutableDictionary's internal function call to -allKeys is enumerating over the keys in order to return the array on another thread.
My question is: Is this what's happening? If so, what would be the best way to avoid this?
Here's the gist of what I'm doing:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
for (NSString *key in [[queue allKeys] reverseObjectEnumerator]) { //To prevent crashes
NEXActivityMap *temp = queue[key];
NSArray *tempKeys = [temp allKeys]; //<= CRASHES HERE
if (tempKeys.count > 0) {
//Do other stuff
}
}
});
You can use @synchronized, and it will work. But this mixes up two different ideas:
Threads have been around for many years. A new thread opens a new control flow. Code in different threads potentially runs concurrently, causing conflicts such as the one you hit. To prevent these conflicts you have to use locks, as @synchronized does.
GCD is the more modern concept. GCD runs "on top of threads": it uses threads, but this is transparent to you and you do not have to care about it. Code running on different queues potentially runs concurrently, causing conflicts. To prevent these conflicts you have to use one queue for each shared resource.
You are already using GCD, which is a good idea:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
The same code with threads would look like this:
[[NSThread mainThread] performSelector:…];
So, using GCD, you should use GCD to prevent the conflicts. What you are doing is to use GCD wrongly and then "repair" that with locks.
Simply put all accesses to the shared resource (in your case the mutable dictionary referred to by temp) onto one serial queue.
Create a queue at the beginning for the accesses. This is a one-time step.
You can use one of the existing queues as you do in your code, but you have to use a serial one! But this potentially leads to long queues of waiting tasks (in your example, blocks). Different tasks in a serial queue are executed one after another, even if CPU cores are idle. So it is not a good idea to put too many tasks onto one queue. Create a queue for each shared resource or "subsystem":
dispatch_queue_t tempQueue;
tempQueue = dispatch_queue_create("tempQueue", NULL);
When code wants to access the mutable dictionary, put it on the queue. It looks like this:
dispatch_sync(tempQueue, // or async, if that is possible
^{
[temp setObject:… forKey:…]; // Or whatever you want to do.
});
You have to put all code accessing the shared resource onto the queue, just as you have to put all code accessing the shared resource inside locks when using threads.
From Apple documentation "Thread safety summary":
Mutable objects are generally not thread-safe. To use mutable objects
in a threaded application, the application must synchronize access to
them using locks. (For more information, see Atomic Operations). In
general, the collection classes (for example, NSMutableArray,
NSMutableDictionary) are not thread-safe when mutations are concerned.
That is, if one or more threads are changing the same array, problems
can occur. You must lock around spots where reads and writes occur to
assure thread safety.
In your case, the following scenario happens. On one thread, you add elements to the dictionary. On another thread, you call the allKeys method. While this method copies all keys into an array, the other thread adds a new key. This causes the exception.
To avoid that, you have several options.
Because you are using dispatch queues, the preferred way is to put all code that accesses the same mutable dictionary instance onto a private serial dispatch queue.
The second option is passing an immutable copy of the dictionary to the other thread. In this case, no matter what happens on the first thread with the original dictionary, the data will still be consistent. Note that you will probably need a deep copy, because you use a hierarchy of dictionaries and arrays.
Alternatively, you can wrap all points where you access the collections with locks. Using @synchronized also implicitly creates a recursive lock for you.
How about wrapping where you get the keys AND where you set the keys with @synchronized?
Example:
- (void)myMethod:(id)anObj
{
@synchronized(anObj)
{
// Everything between the braces is protected by the @synchronized directive.
}
}
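The first two options (a private serial queue, and handing an immutable copy to the consumer) can be combined. The following is a rough sketch under assumptions: ARC is enabled, and GuardedMap, its method names, and the queue label are hypothetical. It uses -initWithDictionary:copyItems: to copy one level deep, which is enough for a dictionary of dictionaries but not for deeper hierarchies.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper around a dictionary of mutable dictionaries; assumes ARC.
@interface GuardedMap : NSObject
- (void)addValue:(id)value forKey:(NSString *)key inGroup:(NSString *)group;
- (NSDictionary *)snapshot;
@end

@implementation GuardedMap {
    NSMutableDictionary<NSString *, NSMutableDictionary *> *_groups;
    dispatch_queue_t _queue;   // every access to _groups goes through this queue
}

- (instancetype)init {
    if ((self = [super init])) {
        _groups = [NSMutableDictionary dictionary];
        _queue = dispatch_queue_create("com.example.GuardedMap", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)addValue:(id)value forKey:(NSString *)key inGroup:(NSString *)group {
    // Writers never touch the dictionaries directly; they enqueue the mutation.
    dispatch_async(_queue, ^{
        NSMutableDictionary *inner = self->_groups[group];
        if (inner == nil) {
            inner = [NSMutableDictionary dictionary];
            self->_groups[group] = inner;
        }
        inner[key] = value;
    });
}

- (NSDictionary *)snapshot {
    // copyItems:YES sends -copy to each inner dictionary, so the result holds
    // immutable inner dictionaries that later writes cannot change.
    __block NSDictionary *copy = nil;
    dispatch_sync(_queue, ^{
        copy = [[NSDictionary alloc] initWithDictionary:self->_groups copyItems:YES];
    });
    return copy;
}
@end
```

A consumer can then call allKeys on, or enumerate, the snapshot on any thread while downloads keep adding to the live data.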

How to dispatch_after in the current queue?

Now that dispatch_get_current_queue is deprecated in iOS 6, how do I use dispatch_after to execute something in the current queue?
The various links in the comments don't say "it's better not to do it." They say you can't do it. You must either pass the queue you want or dispatch to a known queue. Dispatch queues don't have the concept of "current." Blocks often feed from one queue to another (called "targeting"). By the time you're actually running, the "current" queue is not really meaningful, and relying on it can (and historically did) lead to dead-lock. dispatch_get_current_queue() was never meant for dispatching; it was a debugging method. That's why it was removed (since people treated it as if it meant something meaningful).
If you need that kind of higher-level book-keeping, use an NSOperationQueue which tracks its original queue (and has a simpler queuing model that makes "original queue" much more meaningful).
There are several approaches used in UIKit that are appropriate:
Pass the call-back dispatch_queue as a parameter (this is probably the most common approach in new APIs). See [NSURLConnection setDelegateQueue:] or addObserverForName:object:queue:usingBlock: for examples. Notice that NSURLConnection expects an NSOperationQueue, not a dispatch_queue. Higher-level APIs and all that.
Call back on whatever queue you're on and leave it up to the receiver to deal with it. This is how callbacks have traditionally worked.
Demand that there be a runloop on the calling thread, and schedule your callbacks on the calling runloop. This is how NSURLConnection historically worked before queues.
Always make your callbacks on one of the well-known queues (particularly the main queue) unless told otherwise. I don't know of anywhere that this is done in UIKit, but I've seen it commonly in app code, and is a very easy approach most of the time.
Create a queue manually and dispatch both your calling code and your dispatch_after code onto that. That way you can guarantee that both pieces of code are run from the same queue.
Having to do this probably means you need a hack. You can hack around it with another hack:
dispatch_block_t block = ^{
[self doSomething];
usleep(delay_in_us);
[self doSomethingOther];
};
Instead of usleep() you might consider spinning a run loop.
I would not recommend this "approach" though. The better way is to have some method which takes a queue as parameter and a block as parameter, where the block is then executed on the specified queue.
And, by the way, while a block executes there are ways to check whether it is running on a particular queue, or on any of that queue's target queues, provided you have a reference to that queue beforehand: use the functions dispatch_queue_set_specific and dispatch_get_specific.
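As an illustration of the last two points, here is a small sketch that passes the target queue explicitly to a delayed call and tags a queue so that code can later check whether it is running on it. The helper names MakeWorkerQueue, IsOnWorkerQueue and RunAfter, the key variable, and the queue label are all hypothetical; only the dispatch_* calls are real API.

```objc
#import <Foundation/Foundation.h>

// The address of this static is used as a unique key; its value is never read.
static char kWorkerQueueKey;

// Creates a serial queue and tags it so it can be recognised later.
static dispatch_queue_t MakeWorkerQueue(void) {
    dispatch_queue_t queue =
        dispatch_queue_create("com.example.worker", DISPATCH_QUEUE_SERIAL);
    // Any non-NULL context works; no destructor is needed for a static.
    dispatch_queue_set_specific(queue, &kWorkerQueueKey, &kWorkerQueueKey, NULL);
    return queue;
}

// Returns YES when called from a block running on the tagged queue
// (or on a queue that targets it).
static BOOL IsOnWorkerQueue(void) {
    return dispatch_get_specific(&kWorkerQueueKey) == &kWorkerQueueKey;
}

// The caller names the queue explicitly instead of relying on a "current" queue.
static void RunAfter(NSTimeInterval delay, dispatch_queue_t queue, dispatch_block_t block) {
    dispatch_time_t when =
        dispatch_time(DISPATCH_TIME_NOW, (int64_t)(delay * NSEC_PER_SEC));
    dispatch_after(when, queue, block);
}
```

Code that already runs on the worker queue simply passes that same queue to RunAfter, so the delayed block is guaranteed to run on it too.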

NSMutableDictionary - EXC_BAD_ACCESS - simultaneous read/write

I was hoping for some help with my app.
I have a set up where multiple threads access a shared NSMutableDictionary owned by a singleton class. The threads access the dictionary in response to downloading JSON and processing it. The singleton class is basically preventing duplication of some downloaded objects which have an unique id number.
ie.
//NSURLConnection calls:
[[Singleton sharedInstance] processJSON:data];
@interface Singleton : NSObject
+ (Singleton *)sharedInstance;
- (void)processJSON:(NSData *)data;
@property (nonatomic, strong) NSMutableDictionary *store;
@end
@implementation Singleton
- (void)processJSON:(NSData *)data {
...
someCustomClass *potentialEntry = [someCustomClass parse:data];
...
if (![self entryExists:potentialEntry.stringId])
[self addEntry:potentialEntry];
...
}
- (BOOL)entryExists:(NSString *)objectId {
return self.store[objectId] != nil;
}
- (void)addEntry:(someCustomClass *)object {
self.store[object.stringId] = object;
}
@end
There can be as many as 5-10 threads at a time calling processJSON at once.
Not immediately, but after a few minutes of running (quicker on the iPhone than on the simulator), I get the dreaded EXC_BAD_ACCESS.
I don't profess to know how NSMutableDictionary works, but I would guess that there's some kind of hash table in the background which needs to be updated when assigning objects and read when accessing objects. Therefore, if threads were to read from and write to a dictionary at the same instant, this error could occur, maybe because an object has moved in memory?
I'm hoping that someone with more knowledge on the subject could enlighten me!
As for solutions, I was thinking of the singleton class having an NSOperationQueue with a maximum concurrent operation count of 1 and then using addOperationWithBlock: whenever I want to access the NSDictionary. The only problem is that it makes calling processJSON an asynchronous function and I can't return the created object straight away; I'd have to use a block and that would be a bit messier. Is there any way of using @synchronized? Would that work well?
I'd draw your attention to the Synchronization section of the iOS rendition of the Threading Programming Guide that Hot Licks pointed you to. One of those locking mechanisms, or the use of a dedicated serial queue, can help you achieve thread safety.
Your intuition regarding the serial operation queue is promising, though frequently people will use a serial dispatch queue for this (e.g., so you can call dispatch_sync from any queue to your dictionary's serial queue), achieving both a controlled mechanism for interacting with it as well as synchronous operations. Or, even better, you can use a custom concurrent queue (not a global queue), and perform reads via dispatch_sync and perform writes via dispatch_barrier_async, achieving an efficient reader/writer scheme (as discussed in WWDC 2011 - Mastering GCD or WWDC 2012 - Asynchronous Design Patterns).
The Eliminating Lock-Based Code section of the Concurrency Programming Guide outlines some of the rationale for using a serial queue for synchronization versus the traditional locking techniques.
The Grand Central Dispatch (GCD) Reference and the dispatch queue discussion in the Concurrency Programming Guide should provide quite a bit of information.
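The reader/writer scheme described above might be sketched as follows, assuming ARC. EntryStore, its method names, and the queue label are hypothetical. Note that the check-then-add sequence from the question is itself a race even if each call is individually safe, so this sketch performs it inside a single barrier; it uses dispatch_barrier_sync there so the caller gets an answer straight away.

```objc
#import <Foundation/Foundation.h>

// Hypothetical store; assumes ARC.
@interface EntryStore : NSObject
- (id)entryForId:(NSString *)entryId;
- (void)setEntry:(id)entry forId:(NSString *)entryId;
- (BOOL)addEntryIfAbsent:(id)entry forId:(NSString *)entryId;
@end

@implementation EntryStore {
    NSMutableDictionary *_store;
    dispatch_queue_t _queue;   // custom concurrent queue, not a global one
}

- (instancetype)init {
    if ((self = [super init])) {
        _store = [NSMutableDictionary dictionary];
        _queue = dispatch_queue_create("com.example.EntryStore", DISPATCH_QUEUE_CONCURRENT);
    }
    return self;
}

// Reads may run concurrently with each other.
- (id)entryForId:(NSString *)entryId {
    __block id result = nil;
    dispatch_sync(_queue, ^{
        result = self->_store[entryId];
    });
    return result;
}

// A barrier block runs alone: earlier blocks finish first, later ones wait.
- (void)setEntry:(id)entry forId:(NSString *)entryId {
    dispatch_barrier_async(_queue, ^{
        self->_store[entryId] = entry;
    });
}

// Check and insert as one exclusive step; returns YES if the entry was added.
- (BOOL)addEntryIfAbsent:(id)entry forId:(NSString *)entryId {
    __block BOOL added = NO;
    dispatch_barrier_sync(_queue, ^{
        if (self->_store[entryId] == nil) {
            self->_store[entryId] = entry;
            added = YES;
        }
    });
    return added;
}
@end
```

None of these methods should be called from a block that is itself running on the store's queue, since a synchronous barrier submitted from there would wait on its own caller.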
The simplest solution is to just put all of the code that accesses the dict in an @synchronized block.
Serial operation queues are great, but they sound like overkill to me for this, as you aren't guarding a whole ecosystem of data, just one structure.

How to use GCD for lightweight transactional locking of resources?

I'm trying to use GCD as a replacement for dozens of atomic properties. I remember that at WWDC they said GCD could be used for efficient transactional locking mechanisms.
In my OpenGL ES runloop method I put all drawing code in a block executed by dispatch_sync on a custom created serial queue. The runloop is called by a CADisplayLink which is to my knowledge happening on the main thread.
There are ivars and properties which are used both for drawing but also for controlling what will be drawn. The problem is that there must be some locking in place to prevent concurrency problems, and a way of transactionally querying and modifying the state of the OpenGL ES scene from the main thread between two drawn frames.
I can modify a group of properties in a transactional way with GCD by executing a block on that serial queue.
But it seems I can't read values into the main thread, using GCD, while blocking the queue that executes the drawing code. dispatch_sync doesn't have a return value, but I want to get access to presentation values exactly between the drawing of two frames, both for reading and writing.
Is it this barrier thing they were talking about? How does that work?
This is what the async writer / sync reader model was designed to accomplish. Let's say you have an ivar (and for purpose of discussion let's assume that you've gone a wee bit further and encapsulated all your ivars into a single structure, just for simplicity's sake):
struct {
int x, y;
char *n;
dispatch_queue_t _internalQueue;
} myIvars;
Let's further assume (for brevity) that you've initialized the ivars in a dispatch_once() and created the _internalQueue as a serial queue with dispatch_queue_create() earlier in the code.
Now, to write a value:
dispatch_async(myIvars._internalQueue, ^{ myIvars.x = 10; });
dispatch_async(myIvars._internalQueue, ^{ myIvars.n = "Hi there"; });
And to read one:
__block int val; __block char *v;
dispatch_sync(myIvars._internalQueue, ^{ val = myIvars.x; });
dispatch_sync(myIvars._internalQueue, ^{ v = myIvars.n; });
Using the internal queue makes sure everything is appropriately serialized and that writes can happen asynchronously but reads wait for all pending writes to complete before giving you back the value. A lot of "GCD aware" data structures (or routines that have internal data structures) incorporate serial queues as implementation details for just this purpose.
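For the transactional part of the question, several values can be written or read inside one block so that no frame ever sees a half-applied change. Below is a minimal sketch assuming ARC; Scene, SceneState, the method names, and the queue label are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical state that must always be read and written as a unit.
typedef struct {
    int x;
    int y;
} SceneState;

@interface Scene : NSObject
- (void)moveToX:(int)x y:(int)y;
- (SceneState)currentState;
@end

@implementation Scene {
    SceneState _state;
    dispatch_queue_t _queue;   // serial: one block at a time
}

- (instancetype)init {
    if ((self = [super init])) {
        _state = (SceneState){0, 0};
        _queue = dispatch_queue_create("com.example.Scene", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

// Both fields change inside one block, so readers see either the old
// pair or the new pair, never a mix.
- (void)moveToX:(int)x y:(int)y {
    dispatch_async(_queue, ^{
        self->_state.x = x;
        self->_state.y = y;
    });
}

// dispatch_sync has no return value, so the result travels out through
// a __block variable.
- (SceneState)currentState {
    __block SceneState snapshot;
    dispatch_sync(_queue, ^{
        snapshot = self->_state;
    });
    return snapshot;
}
@end
```

The drawing code would run on the same queue, so a call to currentState from the main thread lands between two frames; it must not be called from a block already running on that queue, because dispatch_sync would then deadlock.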
dispatch_sync takes only a queue and a block; it has no separate completion-block argument. Inside the block, though, you can dispatch back to the main queue to pick up the values from your serial queue and use them on the main thread. So it would look something like
dispatch_sync(serialQueue, ^{
// execute a block
dispatch_async(dispatch_get_main_queue(), ^{
// use your calculations here
});
});
And serial queues handle the concurrency part themselves. So if another piece of code tries to run the same code at the same time, the queue itself will handle it. Hope this was of some help.

is there a way that the synchronized keyword doesn't block the main thread

Imagine you want to do many things in the background of an iOS application, and you code it properly so that you create threads (for example using GCD) to execute this background activity.
Now what if you need at some point to update a variable, but this update can occur on any of the threads you created?
You obviously want to protect that variable, and you can use the @synchronized keyword to create the locks for you, but here is the catch (extract from the Apple documentation):
The @synchronized() directive locks a section of code for use by a
single thread. Other threads are blocked until the thread exits the
protected code—that is, when execution continues past the last
statement in the @synchronized() block.
So that means if you synchronize on an object and two threads are writing to it at the same time, even the main thread will block until both threads are done writing their data.
An example of code that will showcase all this:
// Create the background queue
dispatch_queue_t queue = dispatch_queue_create("synchronized_example", NULL);
// Start working in new thread
dispatch_async(queue, ^
{
// Synchronize on that shared resource
@synchronized(sharedResource_)
{
// Write things to that resource
// If more than one thread accesses this piece of code:
// all threads (even main thread) will block until task is completed.
[self writeComplexDataOnLocalFile];
}
});
// won’t actually go away until queue is empty
dispatch_release(queue);
So the question is fairly simple: how can this be overcome? How can we safely add locks on all the threads EXCEPT the main thread which, we know, doesn't need to be blocked in that case?
EDIT FOR CLARIFICATION
As some of you commented, it does seem logical (and this was clearly what I thought at first when using synchronized) that only the two threads that are trying to acquire the lock should block until they are both done.
However, tested in a real situation, this doesn't seem to be the case and the main thread seems to also suffer from the lock.
I use this mechanism to log things in separate threads so that the UI is not blocked. But when I do intense logging, the UI (main thread) is clearly highly impacted (scrolling is not as smooth).
So two options here: either the background tasks are so heavy that even the main thread gets impacted (which I doubt), or synchronized also blocks the main thread while performing the lock operations (which I'm starting to reconsider).
I'll dig a little further using the Time Profiler.
I believe you are misunderstanding the following sentence that you quote from the Apple documentation:
Other threads are blocked until the thread exits the protected code...
This does not mean that all threads are blocked; it just means that all threads trying to synchronise on the same object (sharedResource_ in your example) are blocked.
The following quote is taken from Apple's Threading Programming Guide, which makes it clear that only threads that synchronise on the same object are blocked.
The object passed to the @synchronized directive is a unique identifier used to distinguish the protected block. If you execute the preceding method in two different threads, passing a different object for the anObj parameter on each thread, each would take its lock and continue processing without being blocked by the other. If you pass the same object in both cases, however, one of the threads would acquire the lock first and the other would block until the first thread completed the critical section.
Update: If your background threads are impacting the performance of your interface then you might want to consider putting some sleeps into the background threads. This should allow the main thread some time to update the UI.
I realise you are using GCD but, for example, NSThread has a couple of class methods that will suspend the current thread, e.g. +sleepForTimeInterval:. In GCD you can probably just call sleep().
Alternatively, you might also want to look at changing the thread priority to a lower priority. Again, NSThread has the setThreadPriority: for this purpose. In GCD, I believe you would just use a low priority queue for the dispatched blocks.
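The point that only contenders for the same object block each other can be checked with a small experiment. This is a sketch under assumptions: ARC is enabled, the function name LockOnOtherObjectIsNotBlocked is hypothetical, and the utility QoS global queue stands in for "a low priority queue".

```objc
#import <Foundation/Foundation.h>

// Returns YES if the calling thread can enter @synchronized on one object
// while a background block is still holding @synchronized on another.
static BOOL LockOnOtherObjectIsNotBlocked(void) {
    NSObject *busyResource = [NSObject new];
    NSObject *otherResource = [NSObject new];
    dispatch_semaphore_t lockHeld = dispatch_semaphore_create(0);
    dispatch_semaphore_t letGo = dispatch_semaphore_create(0);

    // Background work on a lower-priority global queue.
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_UTILITY, 0), ^{
        @synchronized (busyResource) {
            dispatch_semaphore_signal(lockHeld);                   // lock is now held
            dispatch_semaphore_wait(letGo, DISPATCH_TIME_FOREVER); // keep holding it
        }
    });

    // Wait until the background block is definitely inside its critical section.
    dispatch_semaphore_wait(lockHeld, DISPATCH_TIME_FOREVER);

    BOOL entered = NO;
    @synchronized (otherResource) {   // different object, so a different lock
        entered = YES;
    }

    dispatch_semaphore_signal(letGo);  // let the background block finish
    return entered;
}
```

If the main thread stutters during heavy logging, this suggests the cause is either contention on the same object from main-thread code or plain CPU and I/O load, rather than @synchronized blocking unrelated threads.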
I'm not sure I understood you correctly, but @synchronized doesn't block all threads, only the ones that want to execute the code inside the block. So the solution probably is: don't execute the code on the main thread.
If you simply want to avoid having the main thread acquire the lock, you can do this (and wreak havoc):
dispatch_async(queue, ^
{
if(![NSThread isMainThread])
{
// Synchronize on that shared resource
@synchronized(sharedResource_)
{
// Write things to that resource
// If more than one thread accesses this piece of code:
// all threads (even main thread) will block until task is completed.
[self writeComplexDataOnLocalFile];
}
}
else
[self writeComplexDataOnLocalFile];
});

Resources