I am diving a bit deeper into concurrency and have been reading extensively about GCD and NSOperation. However, a lot of posts like the canonic answer on SO are several years old.
It seemed to me that NSOperation main advantages used to be, at the cost of some performance:
"the way to go" generally for more than a simple dispatch as the highest level abstraction (built atop of GCD)
to make task manipulation (cancellation, etc.) a lot easier
to easily set up dependencies between tasks
Given GCD's DispatchWorkItem & block cancellation / DispatchGroup / qos in particular, is there really an incentive (cost-performance wise) to use NSOperation anymore for concurrency apart from cases where you need to be able to cancel a task when it began executing or query the task state ?
Apple seems to put a lot more emphasis on GCD, at least in their WWDC (granted it's more recent than NSOperation).
I see them each still having their own purpose. I just recently rewatched the 2015 WWDC talk about this (Advanced NSOperations), and I see two main points here.
Run Time & User Interaction
From the talk:
NSOperations run for a little bit longer than you would expect a block to run, so blocks usually take a few nanoseconds, maybe at most a millisecond, to execute.
NSOperations, on the other hand, can be much longer, for anywhere from a couple of milliseconds to even several minutes
The example they talk about is in the WWDC app, where there exists an NSOperation that has a dependency on having a logged in user. The dependency NSOperation presents a login view controller and waits for the user to authenticate. Once finished, that NSOperation finishes and the NSOperationQueue resumes it's work. I don't think you'd want to use GCD for this scenario.
Subclassing
Since NSOperations are just classes, you can subclass them to get more reusability out of them. This isn't possible with GCD.
Example: (Using the WWDC login scenario from above)
You have many NSOperations in your code base that are associated with a user interaction that requires them to be authenticated. (Liking a video, in this example.) You could extend NSOperation to create an AuthenticatedOperation, then have all those NSOperations extend this new class.
First off, NSOperationQueue let you enqueue operations, that is, some sort of asynchronous operations with a start method, a cancel method and a few observable properties, while with a dispatch queue one can submit a block or a closure or a function to a dispatch queue, which will be then executed.
An "Operation" is semantically fundamentally different than a block (or closure, function). An operation has an underlying asynchronous task, while a block (closure or functions) is just that.
What comes close to an NSOperation, though, is an asynchronous function, e.g.:
func asyncTask(param: Param, completion: (T?, Error?) ->())
Now with Futures we can define the same asynchronous function like:
func asyncTask(param: Param) -> Future<T>
which makes such asynchronous functions quite handy.
Since futures have combinator functions like map and flatMap and so on, we can quite easily "emulate" the "dependency" feature of NSOperation, just in a more powerful, more concise and more comprehensible way.
We can also implement some sort of NSOperationQueue with a few lines of code based solely on GCD, say a "TaskQueue" and with basically the same features, like "maxConcurrentTasks" and can use it to enqueue task functions (not operations), in just a more powerful, more concise and a more comprehensible way as well. ;)
In order to get a cancelable operation, you need to create a subclass of NSOperation - while you can create a async function "ad-hod" - inline.
Also, since cancellation is an independent concept, we can assume, that there exists some library whose implementation is solely based on GCD which solves this problem in the, uhm, the usual way ;) It may look like this:
self.cancellationRequest = CancellationRequest()
self.asyncTask(param: param, cancellationToken: cr.token).map { result in
...
}
and later:
override func viewWillDisappear(_ animated: animated) {
super.viewWillDisappear(animated)
self.cancellationRequest.cancel()
}
So, IMHO there's really no reason to use clunky NSOperation and NSOperationQueue, and there's no reason any more for subclassing NSOperation, which is quite elaborate and surprising difficult, unless you don't care about data races.
Related
Sometimes I need to do synchronous return. In dispatch_sync it's just:
__block int foo
dispatch_sync({
foo = 3
});
return foo;
I am not sure how that translates to NSOperationQueue. I have checked the maxConcurrentOperationCount = 1, but I don't think that's blocking. My understanding is that this only makes the operation queue "serial", but not "synchronous".
One can use addoperations:waitUntilFinished: (addOperations(_:waitUntilFinished:) in Swift), passing true for that second parameter.
For the sake of future readers, while one can wait using waitUntilFinished, 9 times out of 10, it is a mistake. If you have got some asynchronous operation, it is generally asynchronous for a good reason, and you should embrace asynchronous patterns and use asynchronous block completion handler parameters in your own code, rather than forcing it to finish synchronously.
E.g., rather than:
- (int)bar {
__block int foo;
NSOperation *operation = [NSBlockOperation blockOperationWithBlock:^{
foo = 3;
}];
[self.queue addOperations:#[operation] waitUntilFinished:true];
return foo;
}
One generally would do:
- (void)barWithCompletion:(void (^ _Nonnull)(int))completion {
[self.queue addOperationWithBlock:^{
completion(3);
}];
}
[self barWithCompletion:^(int value) {
// use `value` here
}];
// but not here, as the above runs asynchronously
When developers first encounter asynchronous patterns, they almost always try to fight them, making them behave synchronously, but it is almost always a mistake. Embrace asynchronous patterns, don't fight them.
FWIW, the new async-await patterns in Swift allow us to write asynchronous code that is far easier to reason about than the traditional asynchronous blocks/closures. So if Swift and its new concurrency patterns are an option, you might consider that as well.
In the comments below, you seem to be concerned about the performance characteristics of synchronous behavior of operation queues versus dispatch queues. One does not use operation queues over dispatch queues for performance reasons, but rather if you benefit from the added features (including dependencies, priorities, controlling degree of concurrency, elegant cancelation features, wrapping asynchronous processes in operations, etc.).
If performance is of optimal concern (e.g., for thread-safe synchronizing), something like an os_unfair_lock will eclipse dispatch queue performance. But we generally don't let performance dictate our choice except for those 3% of the cases where one absolutely needs that. We try to use the highest level of abstraction possible (even at the cost of modest performance hits) suitable for the task at hand.
I don't understand the workings of a DispatchQueue and wanted to learn more about how they implement the foundational queueing theory requirements. I tried to inspect a queue using:
dump(DispatchQueue.global())
And this gave this output:
- <OS_dispatch_queue_global: com.apple.root.default-qos[0x10c041f00] = { xref = -2147483648, ref = -2147483648, sref = 1, target = [0x0], width = 0xfff, state = 0x0060000000000000, in-barrier}> #0
- super: OS_dispatch_queue
- super: OS_dispatch_object
- super: OS_object
- super: NSObject
I got that the label is com.apple.root.default-qos, and this is specified in the Apple docs and the class is the packaged OS_dispatch_queue_global. I understand qos is queryable on the queue itself and that makes sense as well. Width I think just means the allocated memory size.
What I don't understand are the relevances of xref, ref and sref, I think they are internal ids for the queues but I am not sure. I think they are related to fundamental queueing concepts (multithreading came to mind) but would be great to hone into this in more detail.
Is the autoreleaseFrequency hidden from this debug description? Also, what does in-barrier = 0 mean? I tried creating a custom queue and this was replaced by in-flight = 0.. so confused about that as well.
Any ideas on how these undocumented variables relate to queueing theory? I think these are undocumented internals of the API, so any educated and justified explanations would be fine!
Thanks.
Why ask this?
This is a fairly broad question about the internals of grand-central-dispatch. I had difficulty understanding the dumped output because the original WWDC '10 videos and slides for GCD are no longer public. I also didn't know about the open-source libdispatch repo (thanks Rob). That needn't be a problem, but there are no related QAs on SO explaining the topic in detail.
Why GCD?
According to the WWDC '10 GCD transcripts (Thanks Rob), the main idea behind the API was to simplify the boilerplate associated with using the #selector API for multithreading.
Benefits of GCD
Apple released a new block-based API instead of going with function pointers, to also enable type-safe code that wouldn't crash if the block had the wrong type signature. Using typedefs also made code cleaner when used in function parameters, local variables and #property declarations. Queues allow you to capture code and some state as a chunk of data that get managed, enqueued and executed automatically behind the scenes.
The same session mentions how GCD manages low-level threads under the hood. It enqueues blocks to execute on threads when they need to be executed and then releases those threads (PThreads to be precise) when they are no longer referenced. GCD manages threads automatically and doesn't expose this API - when a DispatchWorkItem is dequeued GCD creates a thread for this block to execute on.
Drawbacks of performSelector
performSelector:onThread:withObject:waitUntilDone: has numerous drawbacks that suggest poor design for the modern challenges of concurrency, waiting, synchronisation. leads to pyramids of doom when switching threads in a func. Furthermore, the NSObject.performSelector family of threading methods are inflexible and limited:
No options to optimise for concurrent, initially inactive, or synchronisation on a particular thread. Unlike GCD.
Only selectors can be dispatched on to new threads (awful).
Lots of threads for a given function leads to messy code (pyramids of doom).
No support for queueing without a limited (at the time when GCD was announced in iOS 4) NSOperation API. NSOperations are a high-level, verbose API that became more powerful after incorporating elements of dispatch (low-level API that became GCD) in iOS 4.
Lots of bugs related to unhandled invalid Selector errors (type safety).
DispatchQueue internals
I believe the xref, ref and sref are internal registers that manage reference counts for automatic reference counting. GCD calls dispatch_retain and dispatch_release in most cases when needed, so we don't need to worry about releasing a queue after all its blocks have been executed. However, there were cases when a developer could call retain and release manually when trying to ensure the queue is retained even when not directly in use. These registers allow libDispatch to crash when release is called on a queue with a positive reference count, for better error handling.
When calling a block with DispatchQueue.global().async or similar, I believe this increments the reference count of that queue (xref and ref).
The variables in the question are not documented explicitly, but from what I can tell:
xref counts the number of external references to a general DispatchQueue.
ref counts the total number of references to a general DispatchQueue.
sref counts the number of references to API serial/concurrent/runloop queues, sources and mach channels (these need to be tracked differently as they are represented using different types).
in-barrier looks like an internal state flag (DispatchWorkItemFlag) to track whether new work items submitted to a concurrent queue should be scheduled or not. Only once the barrier work item finishes, the queue returns to scheduling work items that were submitted after the barrier. in-flight means that there is no barrier in force currently.
state is also not documented explicitly but I presume points to memory where the block can access variables from the scope where the block was scheduled.
I'm curious whether those two types to dispatch work to main queue are equivalent or maybe there are some differentials?
dispatch_async(dispatch_get_main_queue()) {
// Do stuff...
}
and
NSOperationQueue.mainQueue().addOperationWithBlock { [weak self] () -> Void in
// Do stuff..
}
There are differences, but they are somewhat subtle.
Operations enqueued to -[NSOperationQueue mainQueue] get executed one operation per pass of the run loop. This means, among other things, that there will be a "draw" pass between operations.
With dispatch_async(dispatch_get_main_queue(),...) and -[performSelectorOnMainThread:...] all enqueued blocks/selectors are called one after the other without spinning the run loop (i.e. allowing views to draw or anything like that). The runloop will continue after executing all enqueued blocks.
So, with respect to drawing, dispatch_async(dispatch_get_main_queue(),...) and -[performSelectorOnMainThread:...] batch operations into one draw pass, whereas -[NSOperationQueue mainQueue] will draw after each operation.
For a full, in-depth investigation of this, see my answer over here.
At a very basic level they are not both the same thing.
Yes, the operation queue method will be scheduled on GCD queue. But it also gets all the rich benefits of using operation queues, such as an easy way to add dependent operations; state observation; the ability to cancel an operation…
So no, they are not equivalent.
Yes there are difference in GCD and NSOperation.
GCD is light weight can be used to give flavor of multithreading like loading profile pic, loading web page, network call that surely returns at earliest.
NSOperation queue 1. Usually used to make heavy network calls, sort thousand's of record etc.2. Can add new operation, delete, get current status at any operation3. Add completion handler4. get operation count etc are added advantages over GCD
Now that dispatch_get_current_queue is deprecated in iOS 6, how do I use dispatch_after to execute something in the current queue?
The various links in the comments don't say "it's better not to do it." They say you can't do it. You must either pass the queue you want or dispatch to a known queue. Dispatch queues don't have the concept of "current." Blocks often feed from one queue to another (called "targeting"). By the time you're actually running, the "current" queue is not really meaningful, and relying on it can (and historically did) lead to dead-lock. dispatch_get_current_queue() was never meant for dispatching; it was a debugging method. That's why it was removed (since people treated it as if it meant something meaningful).
If you need that kind of higher-level book-keeping, use an NSOperationQueue which tracks its original queue (and has a simpler queuing model that makes "original queue" much more meaningful).
There are several approaches used in UIKit that are appropriate:
Pass the call-back dispatch_queue as a parameter (this is probably the most common approach in new APIs). See [NSURLConnection setDelegateQueue:] or addObserverForName:object:queue:usingBlock: for examples. Notice that NSURLConnection expects an NSOperationQueue, not a dispatch_queue. Higher-level APIs and all that.
Call back on whatever queue you're on and leave it up to the receiver to deal with it. This is how callbacks have traditionally worked.
Demand that there be a runloop on the calling thread, and schedule your callbacks on the calling runloop. This is how NSURLConnection historically worked before queues.
Always make your callbacks on one of the well-known queues (particularly the main queue) unless told otherwise. I don't know of anywhere that this is done in UIKit, but I've seen it commonly in app code, and is a very easy approach most of the time.
Create a queue manually and dispatch both your calling code and your dispatch_after code onto that. That way you can guarantee that both pieces of code are run from the same queue.
Having to do this is likely because the need of a hack. You can hack around this with another hack:
id block = ^foo() {
[self doSomething];
usleep(delay_in_us);
[self doSomehingOther];
}
Instead of usleep() you might consider to loop in a run loop.
I would not recommend this "approach" though. The better way is to have some method which takes a queue as parameter and a block as parameter, where the block is then executed on the specified queue.
And, by the way, there are ways during a block executes to check whether it runs on a particular queue - respectively on any of its parent queue, provided you have a reference to that queue beforehand: use functions dispatch_queue_set_specific, and dispatch_get_specific.
Essentially, I have a set of data in an NSDictionary, but for convenience I'm setting up some NSArrays with the data sorted and filtered in a few different ways. The data will be coming in via different threads (blocks), and I want to make sure there is only one block at a time modifying my data store.
I went through the trouble of setting up a dispatch queue this afternoon, and then randomly stumbled onto a post about #synchronized that made it seem like pretty much exactly what I want to be doing.
So what I have right now is...
// a property on my object
#property (assign) dispatch_queue_t matchSortingQueue;
// in my object init
_sortingQueue = dispatch_queue_create("com.asdf.matchSortingQueue", NULL);
// then later...
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
dispatch_async(_sortingQueue, ^{
// do stuff...
});
}
And my question is, could I just replace all of this with the following?
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
#synchronized (self) {
// do stuff...
};
}
...And what's the difference between the two anyway? What should I be considering?
Although the functional difference might not matter much to you, it's what you'd expect: if you #synchronize then the thread you're on is blocked until it can get exclusive execution. If you dispatch to a serial dispatch queue asynchronously then the calling thread can get on with other things and whatever it is you're actually doing will always occur on the same, known queue.
So they're equivalent for ensuring that a third resource is used from only one queue at a time.
Dispatching could be a better idea if, say, you had a resource that is accessed by the user interface from the main queue and you wanted to mutate it. Then your user interface code doesn't need explicitly to #synchronize, hiding the complexity of your threading scheme within the object quite naturally. Dispatching will also be a better idea if you've got a central actor that can trigger several of these changes on other different actors; that'll allow them to operate concurrently.
Synchronising is more compact and a lot easier to step debug. If what you're doing tends to be two or three lines and you'd need to dispatch it synchronously anyway then it feels like going to the effort of creating a queue isn't worth it — especially when you consider the implicit costs of creating a block and moving it over onto the heap.
In the second case you would block the calling thread until "do stuff" was done. Using queues and dispatch_async you will not block the calling thread. This would be particularly important if you call sortArrayIntoLocalStore from the UI thread.