How does a serial dispatch queue guarantee resource protection? - ios

//my_serial_queue is a serial_dispatch_queue
dispatch_async(my_serial_queue, ^{
//access a shared resource such as a bank account balance
[self changeBankAccountBalance];
});
If I submit 100 tasks that each access and mutate a bank account balance, I understand that a serial queue will execute each task sequentially, but do these tasks finish sequentially as well when using dispatch_async?
What if task #23 that I submit asynchronously to the serial queue takes a really long time to finish? Would task #24 start only when task #23 is done, or would task #24 begin before task #23 is done? If so, couldn't task #24 have the wrong bank account balance when it starts its job, thereby screwing up data integrity?
Thanks!!

Yes, a dedicated serial queue is a wonderful way to synchronize access to some resource shared amongst multiple threads. And, yes, with a serial queue, each task will wait for the prior one to complete.
Two observations:
While this sounds like a highly inefficient process, this is implicitly at the heart of any synchronization technique (whether queue-based or lock-based approach) whose goal is to minimize concurrent updates of a shared resource.
But in many scenarios the serial queue technique can yield significantly better performance than than other common techniques, such as simple mutex lock, NSLock, or the #synchronized directive. For a discussion on the alternative synchronization techniques, see the Synchronization section of the Threading Programming Guide. For a discussion about using queues in lieu of locks, see the Eliminating Lock-Based Code in the Migrating Away from Threads section of the Concurrency Programming Guide.
A variation of the serial queue pattern is to use the "reader-writer" pattern, where you create a GCD concurrent queue:
queue = dispatch_queue_create("identifier", DISPATCH_QUEUE_CONCURRENT);
You then perform reads using dispatch_sync, but you perform writes using dispatch_barrier_async. The net effective is to permit concurrent read operations, but ensure that writes are never performed concurrently.
If your resource permits concurrent reads, then the reader-writer pattern can offer further performance gain over that of a serial queue.
So, in short, while it seems inefficient to have task #24 wait for task #23, that is inherent in any synchronization technique where you strive to minimize concurrent updates of the shared resource. And GCD serial queues are a surprisingly efficient mechanism, often better than many simple locking mechanisms. The reader-writer pattern can, in some circumstance, offer even further performance improvements.
My original answer, below, was in response to the original question which was confusing titled "how does a serial dispatch queue guarantee concurrency?" In retrospect, this was just an accidental use of the wrong terms.
This is an interesting choice of words, "how does a serial dispatch queue guarantee concurrency?"
There are three types of queues, serial, concurrent, and the main queue. A serial queue will, as the name suggests, not start the next dispatched block until the prior one finished. (Using your example, this means that if task 23 takes a long time, it won't start task 24 until it's done.) Sometimes this is critical (e.g. if task 24 is dependent upon the results of task 23 or if both task 23 and 24 are trying to access the same shared resource).
If you want these various dispatched tasks to run concurrently with respect to each other, you use a concurrent queue (either one of the global concurrent queues that you get via dispatch_get_global_queue, or you can create your own concurrent queue using dispatch_queue_create with the DISPATCH_QUEUE_CONCURRENT option). In a concurrent queue, many of your dispatched tasks may run concurrently. Using concurrent queues requires some care (notably the synchronization of shared resources), but can yield significant performance benefits when implemented properly.
And as a compromise between these two approaches, you can use operation queues, which can be both concurrent, but in which you can also constrain how many of the operations on the queue will run concurrently at any given time by setting maxConcurrentOperationCount. A typical scenario where you'll use this is when doing background network tasks, where you do not want more than five concurrent network requests.
For more information, see the Concurrency Programming Guide.

man dispatch_queue_create says: “All memory writes performed by a block dispatched to a serial queue are guaranteed to be visible to subsequent blocks dispatched to the same queue.” Thus, serial queues are a good way to serializing access to mutable state to avoid race conditions.
do these tasks finish sequentially as well when using dispatch_async?
Yes. The queue dictates the execution policy, not how you queue the block.
In other words, if the queue is serial, queueing with async or sync doesn't change that. The only difference is: do I wait for this block to complete before continuing execution of the rest of the program? dispatch_async=no, dispatch_sync=yes.
What if task #23 that I submit asynchronously to the serial queue
takes a really long time to finish?
Nothing changes. A serial queue always waits for the previously dequeued block (#23) to complete before dequeuing and executing the next block (#24). If stalling the queue is a concern, you should implement a timeout inside your block code.

Related

Confusing with technical terminology between Thread and Queue

I am newer to iPhone development. With focusing on develop multi-threading Apps, I have referred to some Apple documents and others explaining thread and multi-threading concepts. But as far as work-queue is concerned, I am confused in understanding relation between Thread, Task and Queue. Some documents says, a thread can have multiple tasks and those tasks is stored in queue, thus each thread may have its own queue. Whereas, some says that, threads themselves are stored in queue.
My question is, can we say:
(1) Threads can have multiple Tasks and those task are stored and managed inside a Queue of that thread.
OR
(2) Threads themselves are stored and managed inside a Queue.
Secondly, I also read something like this:
Another advantage of using a thread pool over creating a new thread for each task is thread creation and destruction overhead is negated.
Is thread pool synonym for work-queue?
I am clear now.
Thread and queue is totally different things. Thread is separate code of execution while queue is data structure for maintaining tasks. A thread can have multiple tasks and all threads may be created for particular goal that is Process. A thread has its own space in memory for its variables and other things.
So, in multi-threading programming, queue is a mechanism that handle sequence of tasks to be executed. Queue always executes tasks in serial order. However, if we want to execute tasks concurrently, we would have to create concurrent queues. Thus multiple queues can executed concurrently benefiting multiprogramming. With latest API, it is upto operating system that how to schedule those tasks. Tasks may or may not be executed on separate thread. Structure always depends on our requirement.

GCD and Threads

I want to understand something about GCD and Threads.
I have a for loop in my view controller which asks my model to do some async network request.
So if the loop runs 5 times, the model sends out 5 network requests.
Is it correct to state that 5 threads have been created by my model considering the fact that I'm using NSURLConnection's sendAsyncRequest and the completion handlers will be called on an additional 5 threads ?
Now, If I ask my view controller to execute this for loop on a different thread and in every iteration of the loop, the call to the model should be dependent on the previous iteration, would I be creating an "Inception" of threads here ?
Basically, I want the subsequent async requests to my server only if the previous thread has completed entirely (By entirely I mean all of its sub threads should have finished executing too.)
I can't even frame the question properly because I'm massively confused myself.
But if anybody could help with anything, that would be helpful.
It is not correct to state that five threads have been created in the general case.
There is no one-to-one mapping between threads and blocks. GCD is an implementation of thread pooling.
A certain number of threads are created according to the optimal setup for that device — the cost of creating and maintaing threads under that release of the OS, the number of processor cores available, the number of threads it already has but which are presently blocked and any other factors Apple cares to factor in may all be relevant.
GCD will then spread your blocks over those threads. Or it may create new threads. But it won't necessarily.
Beyond that queues are just ways of establishing the sequencing between blocks. A serial dispatch queue does not necessarily own its own thread. All concurrent dispatch queues do not necessarily own their own threads. But there's no reason to believe that any set of queues shares any threads.
The exact means of picking threads for blocks has changed between versions of the OS. E.g. iOS 4 was highly profligate in thread creation, in a way that iOS 5+ definitely haven't been.
GCD will just try to do whatever is best in the circumstances. Don't waste your time trying to second guess it.
"Basically, I want the subsequent async requests to my server only if the previous thread has completed entirely (By entirely I mean all of its sub threads should have finished executing too.)"
Only focusing on the above statement to avoid confusion. Simple solution would be create a queue. feed the queue with 5 loops. Each loop will be making network request synchronously(you can use sendSynchronousRequest: method available in NSURLConnection), performing the operations after request completion and then start the next loop. queue following FIFO will execute the your requests subsequently.
GCD : Think of this as a simple queue that can accept tasks. Tasks are blocks of your code. You can put in as many tasks as you want in a queue (permitting system limits). Queues come in different flavors. Concurrent vs Serial. Main vs Global. High Priority vs Low Priority. A queue is not a thread.
Thread : It is a single line of execution of code in sequence. You can have multiple threads working on your code at the same time. A thread is not a queue.
Once you separate the 2 entities things start become clear.
GCD basically uses the threads in the process to work on tasks. In a serial queue everything is processed in sequence. So you don't need to have synchronization mechanisms in your code, the very nature of serial queue ensures synchronization. If this is a concurrent queue (i.e. 2 or more tasks being processed at the same time, then you need to ensure critical sections of your code are protected with synchronization).
Here is how you queue work to be done.
dispatch_async(_yourDispatchQueue, ^() {
NSLog (#"work queued");
});
The above NSLog will now get executed in a background thread in a near future time, but in a background thread.
If you notice when we put a request in we use dispatch_async. The other variation is dispatch_sync. The different between the 2 is after you put the request in the queue, the async variation will move on. The sync variation will not !!
If you are going to use a GCD for NSURLConnection you need to be careful on which thread you start the connection. Here is a SO link for more info. GCD with NSURLConnection

If I want a task to run in the background, how does the "dispatch_get_global_queue" queue work?

When selecting which queue to run dispatch_async on, dispatch_get_global_queue is mentioned a lot. Is this one special background queue that delegates tasks to a certain thread? Is it almost a singleton?
So if I use that queue always for my dispatch_async calls, will that queue get full and have to wait for things to finish before another one can start, or can it assign other tasks to different threads?
I guess I'm a little confused because when I'm choosing the queue for an NSOperation, I can choose the queue for the main thread with [NSOperationQueue mainQueue], which seems synonymous to dispatch_get_main_queue but I was under the impression background queues for NSOperation had to be individually made instances of NSOperationQueue, yet GCD has a singleton for a background queue? (dispatch_get_global_queue)
Furthermore - silly question but wanted to make sure - if I put a task in a queue, the queue is assigned to one thread, right? If the task is big enough it won't split it up over multiple threads, will it?
When selecting which queue to run dispatch_async on,
dispatch_get_global_queue is mentioned a lot. Is this one special
background queue that delegates tasks to a certain thread?
A certain thread? No. dispatch_get_global_queue retrieves for you a global queue of the requested relative priority. All queues returned by dispatch_get_global_queue are concurrent, and may, at the system's discretion, dispatch work to many different threads. The mechanics of this are an implementation detail that is opaque to you as a consumer of the API.
In practice, and at the risk of oversimplifying it, there is one global queue for each priority level, and at the time of this writing, based on my experience, each of those will at any given time be dispatching work to between 0 and 64 threads.
Is it almost a singleton?
Strictly no, but you can think of them as singletons where there is one singleton per priority level.
So if I use that queue always for my dispatch_async calls, will that
queue get full and have to wait for things to finish before another
one can start, or can it assign other tasks to different threads?
It can get full. Practically speaking, if you are saturating one of the global concurrent queues (i.e. more than 64 background tasks of the same priority in flight at the same time), you probably have a bad design. (See this answer for more details on queue width limits)
I guess I'm a little confused because when I'm choosing the queue for
an NSOperation, I can choose the queue for the main thread with
[NSOperationQueue mainQueue], which seems synonymous to
dispatch_get_main_queue
They are not strictly synonymous. Although NSOperationQueue uses GCD under the hood, there are some important differences. For instance, in a single pass of the main run loop, only one operation enqueued to +[NSOperationQueue mainQueue] will be executed, whereas more than one block submitted to dispatch_get_main_queue might be executed on a single run loop pass. This probably doesn't matter to you, but they are not, strictly speaking, the same thing.
but I was under the impression background
queues for NSOperation had to be individually made instances of
NSOperationQueue, yet GCD has a singleton for a background queue?
(dispatch_get_global_queue)
In short, yes. It sounds like you're conflating GCD and NSOperationQueue. NSOperationQueue is not just a "trivial wrapper" around GCD, it's its own thing. The fact that it's implemented on top of GCD should not really matter to you. NSOperationQueue is a task queue, with an explicitly settable width, that you can create instances of "at will." You can make as many of them as you like. At some point, all instances of NSOperationQueue are, when executing NSOperations, pulling resources from the same pool of system resources as the rest of your process, including GCD, so yes, there are some interactions there, but they are opaque to you.
Furthermore - silly question but wanted to make sure - if I put a task
in a queue, the queue is assigned to one thread, right? If the task is
big enough it won't split it up over multiple threads, will it?
A single task can only ever be executed on a single thread. There's not some magical way that the system would have to "split" a monolithic task into subtasks. That's your job. With regard to your specific wording, the queue isn't "assigned to one thread", the task is. The next task from the queue to be executed might be executed on a completely different thread.

Concurrency and synchronous execution

I was reading OReilly's iOS6 Programming Cookbook and am confused about something. Quoting from page 378, chapter 6 "Concurrency":
For any task that doesn’t involve the UI, you can use global concurrent queues in GCD.
These allow either synchronous or asynchronous execution. But synchronous execution
does not mean your program waits for the code to finish before continuing. It
simply means that the concurrent queue will wait until your task has finished before it
continues to the next block of code on the queue. When you put a block object on a
concurrent queue, your own program always continues right away without waiting for
the queue to execute the code. This is because concurrent queues, as their name implies,
run their code on threads other than the main thread.
I bolded the text that intrigues me. I think it is false because as I've just learned today synchronous execution means precisely that the program waits for the code to finish before continuing.
Is this correct or how does it really work?
How is this paragraph wrong? Let us count the ways:
For any task that doesn’t involve the UI, you can use global
concurrent queues in GCD.
This is overly specific and inaccurate. Certain UI centric tasks, such as loading images, could be done off the main thread. This would be better said as "In most cases, don't interact with UIKit classes except from the main thread," but there are exceptions (for instance, drawing to a UIGraphicsContext is thread-safe as of iOS 4, IIRC, and drawing is a great example of a CPU intensive task that could be offloaded to a background thread.) FWIW, any work unit you can submit to a global concurrent queue you can also submit to a private concurrent queue too.
These allow either synchronous or asynchronous execution. But
synchronous execution does not mean your program waits for the code to
finish before continuing. It simply means that the concurrent queue
will wait until your task has finished before it continues to the next
block of code on the queue.
As iWasRobbed speculated, they appear to have conflated sync/async work submission with serial/concurrent queues. Synchronous execution does, by definition, mean that your program waits for the code to return before continuing. Asynchronous execution, by definition, means that your program does not wait. Similarly, serial queues only execute one submitted work unit at a time, executing each in FIFO order. Concurrent queues, private or global, in the general case (more in a second), schedule submitted blocks for execution, in the order in which they were enqueued, on one or more background threads. The number of background threads used is an opaque implementation detail.
When you put a block object on a concurrent queue, your own program
always continues right away without waiting for the queue to execute
the code.
Nope. Not true. Again, they're mixing up sync/async and serial/concurrent. I suspect what they're trying to say is: When you enqueue a block asynchronously, your own program always continues right away without waiting for the queue to execute the code.
This is because concurrent queues, as their name implies, run their code on threads other than the main thread.
This is also not correct. For instance, if you have a private concurrent queue that you are using to act as a reader/writer lock that protects some mutable state, if you dispatch_sync to that queue from the main thread, your code will, in many cases, execute on the main thread.
Overall, this whole paragraph is really pretty horrible and misleading.
EDIT: I mentioned this in a comment on another answer, but it might be helpful to put it here for clarity. The concept of "Synchronous vs. Asynchronous dispatch" and the concept of "Serial vs. Concurrent queues" are largely orthogonal. You can dispatch work to any queue (serial or concurrent) in either a synchronous or asynchronous way. The sync/async dichotomy is primarily relevant to the "dispatch*er*" (in that it determines whether the dispatcher is blocked until completion of the block or not), whereas the Serial/Concurrent dichotomy is primarily relevant to the dispatch*ee* block (in that it determines whether the dispatchee is potentially executing concurrently with other blocks or not).
I think that bit of text is poorly written, but they are basically explaining the difference between execution on a serial queue with execution on a concurrent queue. A serial queue is run on one thread so it doesn't have a choice but to execute one task at a time, whereas a concurrent queue can use one or more threads.
Serial queue's execute one task after the next in the order in which they were put into the queue. Each task has to wait for the prior task to be executed before it can then be executed (i.e. synchronously).
In a concurrent queue, tasks can be run at the same time that other tasks are also run since they normally use multiple threads (i.e. asynchronously), but again they are still executed in the order they were enqueued and they can effectively be finished in any order. If you use NSOperation, you can also set up dependencies on a concurrent queue to ensure that certain tasks are executed before other tasks are.
More info:
https://developer.apple.com/library/ios/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationQueues/OperationQueues.html
The author is Vandad Nahavandipoor, I don't want to affect this guy's sales income, but all his books contain the same mistakes in the concurrency chapters:
http://www.amazon.com/Vandad-Nahavandipoor/e/B004JNSV7I/ref=sr_tc_2_rm?qid=1381231858&sr=8-2-ent
Which is ironic since he's got a 50-page book exactly on this subject.
http://www.amazon.com/Concurrent-Programming-Mac-iOS-Performance/dp/1449305636/ref=la_B004JNSV7I_1_6?s=books&ie=UTF8&qid=1381232139&sr=1-6
People should STOP reading this guy's books.
When you put a block object on a concurrent queue, your own program
always continues right away without waiting for the queue to execute
the code. This is because concurrent queues, as their name implies,
run their code on threads other than the main thread.
I find it confusing, and the only explanation I can think of, is that she is talking about who blocks who. From man dispatch_sync:
Conceptually, dispatch_sync() is a convenient wrapper around
dispatch_async() with the addition of a semaphore to wait for
completion of the block, and a wrapper around the block to signal its
completion.
So execution returns to your code right away, but the next thing the dispatch_sync does after queueing the block, is wait on a semaphore until the block is executed. Your code blocks because it chooses to.
The other way your code would block is when the queue chooses to run a block using your thread (the one from where you executed dispatch_sync). In this case, your code wouldn't recover control until the block is executed, so the check on the semaphore would always find the block is done.
Erica Sadun for sure knows better than me, so maybe I'm missing some nuance here, but this is my understanding.

What is the difference between dispatch_get_global_queue and dispatch_queue_create?

I'm writing a moderately complex iOS program that needs to have multiple threads for some of its longer operations (parsing, connections to the network...etc). However, I'm confused as to what the difference is between dispatch_get_global_queue and dispatch_queue_create.
Which one should I use and could you give me a simple explanation of what the difference is in general? Thanks.
As the documentation describes, a global queue is good for concurrent tasks (i.e. you're going to dispatch various tasks asynchronously and you're perfectly happy if they run concurrently) and if you don't want to encounter the theoretical overhead of creating and destroying your own queue.
The creating of your own queue is very useful if you need a serial queue (i.e. you need the dispatched blocks to be executed one at a time). This can be useful in many scenarios, such as when each task is dependent upon the preceding one or when coordinating interaction with some shared resource from multiple threads.
Less common, but you will also want to create your own queue if you need to use barriers in conjunction with a concurrent queue. In that scenario, create a concurrent queue (i.e. dispatch_queue_create with the DISPATCH_QUEUE_CONCURRENT option) and use the barriers in conjunction with that queue. You should never use barriers on global queues.
My general counsel is if you need a serial queue (or need to use barriers), then create a queue. If you don't, go ahead and use the global queue and bypass the overhead of creating your own.
If you want a concurrent queue, but want to control how many operations can run concurrently, you can also consider using NSOperationQueue which has a maxConcurrentOperationCount property. This can be useful when doing network operations and you don't want too many concurrent requests being submitted to your server.
Just posted in a different answer, but here is something I wrote quite a while back:
The best way to conceptualize queues is to first realize that at the very low-level, there are only two types of queues: serial and concurrent.
Serial queues are monogamous, but uncommitted. If you give a bunch of tasks to each serial queue, it will run them one at a time, using only one thread at a time. The uncommitted aspect is that serial queues may switch to a different thread between tasks. Serial queues always wait for a task to finish before going to the next one. Thus tasks are completed in FIFO order. You can make as many serial queues as you need with dispatch_queue_create.
The main queue is a special serial queue. Unlike other serial queues, which are uncommitted, in that they are "dating" many threads but only one at time, the main queue is "married" to the main thread and all tasks are performed on it. Jobs on the main queue need to behave nicely with the runloop so that small operations don't block the UI and other important bits. Like all serial queues, tasks are completed in FIFO order.
If serial queues are monogamous, then concurrent queues are promiscuous. They will submit tasks to any available thread or even make new threads depending on system load. They may perform multiple tasks simultaneously on different threads. It's important that tasks submitted to the global queue are thread-safe and minimize side-effects. Tasks are submitted for execution in FIFO order, but order of completion is not guaranteed. As of this writing, there are only three concurrent queues and you can't make them, you can only fetch them with dispatch_get_global_queue.
edit: blog post expanding on this answer: http://amattn.com/p/grand_central_dispatch_gcd_summary_syntax_best_practices.html
One returns the existing global queue, the other creates a new one. Instead of using GCD, I would consider using NSOperation and operation queue. You can find more information about it in this guide. Typically, of you want the operations to execute concurrently, you want to create your own queue and put your operations in it.

Resources