When to use Semaphore instead of Dispatch Group? - ios

I would assume that I am aware of how to work with DispatchGroup, for understanding the issue, I've tried:
class ViewController: UIViewController {
override func viewDidLoad() {
super.viewDidLoad()
performUsingGroup()
}
func performUsingGroup() {
let dq1 = DispatchQueue.global(qos: .userInitiated)
let dq2 = DispatchQueue.global(qos: .userInitiated)
let group = DispatchGroup()
group.enter()
dq1.async {
for i in 1...3 {
print("\(#function) DispatchQueue 1: \(i)")
}
group.leave()
}
group.wait()
dq2.async {
for i in 1...3 {
print("\(#function) DispatchQueue 2: \(i)")
}
}
group.notify(queue: DispatchQueue.main) {
print("done by group")
}
}
}
and the result -as expected- is:
performUsingGroup() DispatchQueue 1: 1
performUsingGroup() DispatchQueue 1: 2
performUsingGroup() DispatchQueue 1: 3
performUsingGroup() DispatchQueue 2: 1
performUsingGroup() DispatchQueue 2: 2
performUsingGroup() DispatchQueue 2: 3
done by group
For using the Semaphore, I implemented:
func performUsingSemaphore() {
let dq1 = DispatchQueue.global(qos: .userInitiated)
let dq2 = DispatchQueue.global(qos: .userInitiated)
let semaphore = DispatchSemaphore(value: 1)
dq1.async {
semaphore.wait()
for i in 1...3 {
print("\(#function) DispatchQueue 1: \(i)")
}
semaphore.signal()
}
dq2.async {
semaphore.wait()
for i in 1...3 {
print("\(#function) DispatchQueue 2: \(i)")
}
semaphore.signal()
}
}
and called it in the viewDidLoad method. The result is:
performUsingSemaphore() DispatchQueue 1: 1
performUsingSemaphore() DispatchQueue 1: 2
performUsingSemaphore() DispatchQueue 1: 3
performUsingSemaphore() DispatchQueue 2: 1
performUsingSemaphore() DispatchQueue 2: 2
performUsingSemaphore() DispatchQueue 2: 3
Conceptually, both of DispachGroup and Semaphore serve the same purpose (unless I misunderstand something).
Honestly, I am unfamiliar with: when to use the Semaphore, especially when workin with DispachGroup -probably- handles the issue.
What is the part that I am missing?

Conceptually, both of DispatchGroup and Semaphore serve the same purpose (unless I misunderstand something).
The above is not exactly true. You can use a semaphore to do the same thing as a dispatch group but it is much more general.
Dispatch groups are used when you have a load of things you want to do that can all happen at once, but you need to wait for them all to finish before doing something else.
Semaphores can be used for the above but they are general purpose synchronisation objects and can be used for many other purposes too. The concept of a semaphore is not limited to Apple and can be found in many operating systems.
In general, a semaphore has a value which is a non negative integer and two operations:
wait If the value is not zero, decrement it, otherwise block until something signals the semaphore.
signal If there are threads waiting, unblock one of them, otherwise increment the value.
Needless to say both operations have to be thread safe. In olden days, when you only had one CPU, you'd simply disable interrupts whilst manipulating the value and the queue of waiting threads. Nowadays, it is more complicated because of multiple CPU cores and on chip caches etc.
A semaphore can be used in any case where you have a resource that can be accessed by at most N threads at the same time. You set the semaphore's initial value to N and then the first N threads that wait on it are not blocked but the next thread has to wait until one of the first N threads has signaled the semaphore. The simplest case is N = 1. In that case, the semaphore behaves like a mutex lock.
A semaphore can be used to emulate a dispatch group. You start the sempahore at 0, start all the tasks - tracking how many you have started and wait on the semaphore that number of times. Each task must signal the semaphore when it completes.
However, there are some gotchas. For example, you need a separate count to know how many times to wait. If you want to be able to add more tasks to the group after you have started waiting, the count can only be updated in a mutex protected block and that may lead to problems with deadlocking. Also, I think the Dispatch implementation of semaphores might be vulnerable to priority inversion. Priority inversion occurs when a high priority thread waits for a resource that a low priority has grabbed. The high priority thread is blocked until the low priority thread releases the resource. If there is a medium priority thread running, this may never happen.
You can pretty much do anything with a semaphore that other higher level synchronisation abstractions can do, but doing it right is often a tricky business to get right. The higher level abstractions are (hopefully) carefully written and you should use them in preference to a "roll your own" implementation with semaphores, if possible.

Semaphores and groups have, in a sense, opposite semantics. Both maintain a count. With a semaphore, a wait is allowed to proceed when the count is non-zero. With a group, a wait is allowed to proceed when the count is zero.
A semaphore is useful when you want to set a maximum on the number of threads operating on some shared resource at a time. One common use is when the maximum is 1 because the shared resource requires exclusive access.
A group is useful when you need to know when a bunch of tasks have all been completed.

Use a semaphore to limit the amount of concurrent work at a given time. Use a group to wait for any number of concurrent work to finish execution.
In case you wanted to submit three jobs per queue it should be
import Foundation
func performUsingGroup() {
let dq1 = DispatchQueue(label: "q1", attributes: .concurrent)
let dq2 = DispatchQueue(label: "q2", attributes: .concurrent)
let group = DispatchGroup()
for i in 1...3 {
group.enter()
dq1.async {
print("\(#function) DispatchQueue 1: \(i)")
group.leave()
}
}
for i in 1...3 {
group.enter()
dq2.async {
print("\(#function) DispatchQueue 2: \(i)")
group.leave()
}
}
group.notify(queue: DispatchQueue.main) {
print("done by group")
}
}
performUsingGroup()
RunLoop.current.run(mode: RunLoop.Mode.default, before: Date(timeIntervalSinceNow: 1))
and
import Foundation
func performUsingSemaphore() {
let dq1 = DispatchQueue(label: "q1", attributes: .concurrent)
let dq2 = DispatchQueue(label: "q2", attributes: .concurrent)
let semaphore = DispatchSemaphore(value: 1)
for i in 1...3 {
dq1.async {
_ = semaphore.wait(timeout: DispatchTime.distantFuture)
print("\(#function) DispatchQueue 1: \(i)")
semaphore.signal()
}
}
for i in 1...3 {
dq2.async {
_ = semaphore.wait(timeout: DispatchTime.distantFuture)
print("\(#function) DispatchQueue 2: \(i)")
semaphore.signal()
}
}
}
performUsingSemaphore()
RunLoop.current.run(mode: RunLoop.Mode.default, before: Date(timeIntervalSinceNow: 1))

The replies above by Jano and Ken are correct regarding 1) the use of semaphore to limit the amount of work happening at once 2) the use of a dispatch group so that the group will be notified when all the tasks in the group are done. For example, you may want to download a lot of images in parallel but since you know that they are heavy images, you want to limit to two downloads only at a single time so you use a semaphore. You also want to be notified when all the downloads (say there are 50 of them) are done, so you use DispatchGroup. Thus, it is not a matter of choosing between the two. You may use one or both in the same implementation depending on your goals. This type of example was provided in the Concurrency tutorial on Ray Wenderlich's site:
let group = DispatchGroup()
let queue = DispatchQueue.global(qos: .utility)
let semaphore = DispatchSemaphore(value: 2)
let base = "https://yourbaseurl.com/image-id-"
let ids = [0001, 0002, 0003, 0004, 0005, 0006, 0007, 0008, 0009, 0010, 0011, 0012]
var images: [UIImage] = []
for id in ids {
guard let url = URL(string: "\(base)\(id)-jpeg.jpg") else { continue }
semaphore.wait()
group.enter()
let task = URLSession.shared.dataTask(with: url) { data, _, error in
defer {
group.leave()
semaphore.signal()
}
if error == nil,
let data = data,
let image = UIImage(data: data) {
images.append(image)
}
}
task.resume()
}

One typical semaphore use case is a function that can be called simultaneously from different threads and uses a resource that should not be called from multiple threads at the same time:
func myFunction() {
semaphore.wait()
// access the shared resource
semaphore.signal()
}
In this case you will be able to call myFunction from different threads but they won't be able to reach the locked resource simultaneously. One of them will have to wait until the second one finishes its work.
A semaphore keeps a count, therefore you can actually allow for a given number of threads to enter your function at the same time.
Typical shared resource is the output to a file.
A semaphore is not the only way to solve such problems. You can also add the code to a serial queue, for example.
Semaphores are low level primitives and most likely they are used a lot under the hood in GCD.
Another typical example is the producer-consumer problem, where the signal and wait calls are actually part of two different functions. One which produces data and one which consumes them.

Generally semaphore can be considered mainly that we can solve the critical section problem. Locking certain resource to achieve synchronisation. Also what happens if sleep() is invoked, can we achieve the same thing by using a semaphore ?
Dispatch groups we will use when we have multiple group of operations to be carried out and we need a tracking or set dependencies each other or notification when a group os tasks finishes its execution.

Related

GCD serial async queue vs serial sync queue nested in async

I have to protect a critical section of my code.
I don't want the caller to be blocked by the function that can be time consuming so I'm creating a serial queue with background qos and then dispatching asynchronously:
private let someQueue = DispatchQueue(label: "\(type(of: self)).someQueue", qos: .background)
func doSomething() {
self.someQueue.async {
//critical section
}
}
For my understanding, the function will directly return on the calling thread without blocking.
I've also seen somewhere dispatching first asynchronously on the global queue, the synchronously on a serial queue:
private let someQueue2 = DispatchQueue(label: "\(type(of: self)).someQueue2")
func doSomething() {
DispatchQueue.global(qos: .background).async {
self.someQueue2.sync {
//critical section
}
}
}
What's the difference between the two approaches?
Which is the right approach?
In the first approach, the calling thread is not blocked and the task (critical section) passed in the async block will be executed in background.
In the second approach, the calling thread is not blocked, but the "background" thread will be waiting for the sync block (critical section) execution which is executed by another thread.
I don't know what you do in your critical section, but it seems first approach seems the best one. Note that background qos is quite slow, maybe use default qos for your queue, unless you know what you are doing. Also note that convention wants that you use bundle identifier as label for your queue. So something like this:
private let someQueue = DispatchQueue(label: "\(Bundle.main.bundleIdentifier ?? "").\(type(of: self)).someQueue")

Restrict count operations in OperationQueue

I need to restrict count of operations in a concurrent operations queue. To achieve it I use operationsCount property, and waitUntilAllOperationsAreFinished.
func exec(_ block: #escaping ()-> Void) {
self.queue.addOperation(block)
if (self.queue.operationCount == self.queue.maxConcurrentOperationCount)
{
self.wait()
}
}
func wait() {
self.queue.waitUntilAllOperationsAreFinished()
}
In iOS 13 operationsCount is deprecated.
I know that I can use DispatchSemaphore with max count concurrent operations as value:
let semaphore = DispatchSemaphore(value: 10)
<...>
semaphore.wait(timeout: DISPATCH_TIME_FOREVER)
async{
work()
semaphore.signal()
}
But in my tests wait and signal do not have enough performance. How I can get operations count in iOS 13?
You can use maxConcurrentOperationCount and spare yourself all the headache of semaphores. Limiting maxConcurrentOperationCount you can add as much operations to the queue as you want, but only the number you've specified will be performed concurrently.
Take a look at the documentation: https://developer.apple.com/documentation/foundation/operationqueue/1414982-maxconcurrentoperationcount.

What's the difference between operation and block in NSOperation?

When I use following codes:
let queue = OperationQueue()
let operation = BlockOperation()
for i in 0..<10 {
operation.addExecutionBlock({
print("===\(Thread.current)===\(i)"
})
}
queue.addOperation(operation)
I create a asynchronous queue to execute these operations.
And if I use codes like following:
let queue = OperationQueue()
for i in 0..<10 {
queue.addOperation(
print("===\(Thread.current)===\(i)"
)
}
When I make the queue concurrent,they produce the same result.
But when I set
queue.maxConcurrentOperationCount = 1
to make the queue serial, they are different!
The first one still print the unordered result like the concurrent queue. But the second one can print the ordered result.
So what's the difference between them? When I want to use NSOperation, which one should I use? Any help much appreciated!
The docs on OperationQueue tell you about concurrency and order of execution of the blocks you submit. You should read the Xcode article on OperationQueue. Here is a relevant bit:
An operation queue executes its queued operation objects based on
their priority and readiness. If all of the queued operation objects
have the same priority and are ready to execute when they are put in
the queue—that is, their isReady method returns true—they are executed
in the order in which they were submitted to the queue. However, you
should never rely on queue semantics to ensure a specific execution
order of operation objects. Changes in the readiness of an operation
can change the resulting execution order. If you need operations to
execute in a specific order, use operation-level dependencies as
defined by the Operation class.
Check please the official documentation regarded addExecutionBlock: function. It just adds the specified block to the receiver's list of blocks to perform in context of executing operation.
If you would like to do it synchronously, here is a code sample:
let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1
for i in 0..<10 {
let operation = BlockOperation {
print("===\(Thread.current)===\(i)")
}
queue.addOperation(operation)
}
When I want to use NSOperation, which one should I use?
Use the second one.
Just a guess.
In this case:
let queue = OperationQueue()
let operation = BlockOperation()
for i in 0..<10 {
operation.addExecutionBlock({
print("===\(Thread.current)===\(i)"
})
}
queue.addOperation(operation)
Inside the BlockOperation, blocks are asynchronous while the BlockOperation itself
is synchronous. So it actually is a synchronous queue.
So the use of queue.addOperation(operation) is nonsense. Instead of it,
I should use operation.start() because this is a synchronous queue.
The function addExecutionBlock() should be used when you need a synchronous queue.
The function addOperation() should be used when you need a asynchronous queue.
Difference -> BlockOperation has a addDependency whereas OperationQueue() needs to addOperations. Following code with console output will elaborate:
let opQueue = OperationQueue()
opQueue.addOperation {
print("operation 1")
}
let operation2 = BlockOperation {
print("operation 2")
}
let operation3 = BlockOperation {
print("operation 3")
}
operation2.addDependency(operation3)
opQueue.addOperation(operation2)
opQueue.addOperation(operation3)
Console output:
operation 1
operation 3
operation 2

Mutex alternatives in swift

I have a shared-memory between multiple threads. I want to prevent these threads access this piece of memory at a same time. (like producer-consumer problem)
Problem:
A thread add elements to a queue and another thread reads these elements and delete them. They shouldn't access the queue simultaneously.
One solution to this problem is to use Mutex.
As I found, there is no Mutex in Swift. Is there any alternatives in Swift?
There are many solutions for this but I use serial queues for this kind of action:
let serialQueue = DispatchQueue(label: "queuename")
serialQueue.sync {
//call some code here, I pass here a closure from a method
}
Edit/Update: Also for semaphores:
let higherPriority = DispatchQueue.global(qos: .userInitiated)
let lowerPriority = DispatchQueue.global(qos: .utility)
let semaphore = DispatchSemaphore(value: 1)
func letUsPrint(queue: DispatchQueue, symbol: String) {
queue.async {
debugPrint("\(symbol) -- waiting")
semaphore.wait() // requesting the resource
for i in 0...10 {
print(symbol, i)
}
debugPrint("\(symbol) -- signal")
semaphore.signal() // releasing the resource
}
}
letUsPrint(queue: lowerPriority, symbol: "Low Priority Queue Work")
letUsPrint(queue: higherPriority, symbol: "High Priority Queue Work")
RunLoop.main.run()
Thanks to beshio's comment, you can use semaphore like this:
let semaphore = DispatchSemaphore(value: 1)
use wait before using the resource:
semaphore.wait()
// use the resource
and after using release it:
semaphore.signal()
Do this in each thread.
As people commented (incl. me), there are several ways to achieve this kind of lock. But I think dispatch semaphore is better than others because it seems to have the least overhead. As found in Apples doc, "Replacing Semaphore Code", it doesn't go down to kernel space unless the semaphore is already locked (= zero), which is the only case when the code goes down into the kernel to switch the thread. I think that semaphore is not zero most of the time (which is of course app specific matter, though). Thus, we can avoid lots of overhead.
One more comment on dispatch semaphore, which is the opposite scenario to above. If your threads have different execution priorities, and the higher priority threads have to lock the semaphore for a long time, dispatch semaphore may not be the solution. This is because there's no "queue" among waiting threads. What happens at this case is that higher priority
threads get and lock the semaphore most of the time, and lower priority threads can lock the semaphore only occasionally, thus, mostly just waiting. If this behavior is not good for your application, you have to consider dispatch queue instead.
You can use NSLock or NSRecursiveLock. If you need to call one locking function from another locking function use recursive version.
class X {
let lock = NSLock()
func doSome() {
lock.lock()
defer { lock.unlock() }
//do something here
}
}

How does a serial queue/private dispatch queue know when a task is complete?

(Perhaps answered by How does a serial dispatch queue guarantee resource protection? but I don't understand how)
Question
How does gcd know when an asynchronous task (e.g. network task) is finished? Should I be using dispatch_retain and dispatch_release for this purpose? Update: I cannot call either of these methods with ARC... What do?
Details
I am interacting with a 3rd party library that does a lot of network access. I have created a wrapper via a small class that basically offers all the methods i need from the 3rd party class, but wraps the calls in dispatch_async(serialQueue) { () -> Void in (where serialQueue is a member of my wrapper class).
I am trying to ensure that each call to the underlying library finishes before the next begins (somehow that's not already implemented in the library).
The serialisation of work on a serial dispatch queue is at the unit of work that is directly submitted to the queue. Once execution reaches the end of the submitted closure (or it returns) then the next unit of work on the queue can be executed.
Importantly, any other asynchronous tasks that may have been started by the closure may still be running (or may not have even started running yet), but they are not considered.
For example, for the following code:
dispatch_async(serialQueue) {
print("Start")
dispatch_async(backgroundQueue) {
functionThatTakes10Seconds()
print("10 seconds later")
}
print("Done 1st")
}
dispatch_async(serialQueue) {
print("Start")
dispatch_async(backgroundQueue) {
functionThatTakes10Seconds()
print("10 seconds later")
}
print("Done 2nd")
}
The output would be something like:
Start
Done 1st
Start
Done 2nd
10 seconds later
10 seconds later
Note that the first 10 second task hasn't completed before the second serial task is dispatched. Now, compare:
dispatch_async(serialQueue) {
print("Start")
dispatch_sync(backgroundQueue) {
functionThatTakes10Seconds()
print("10 seconds later")
}
print("Done 1st")
}
dispatch_async(serialQueue) {
print("Start")
dispatch_sync(backgroundQueue) {
functionThatTakes10Seconds()
print("10 seconds later")
}
print("Done 2nd")
}
The output would be something like:
Start
10 seconds later
Done 1st
Start
10 seconds later
Done 2nd
Note that this time because the 10 second task was dispatched synchronously the serial queue was blocked and the second task didn't start until the first had completed.
In your case, there is a very good chance that the operations you are wrapping are going to dispatch asynchronous tasks themselves (since that is the nature of network operations), so a serial dispatch queue on its own is not enough.
You can use a DispatchGroup to block your serial dispatch queue.
dispatch_async(serialQueue) {
let dg = dispatch_group_create()
dispatch_group_enter(dg)
print("Start")
dispatch_async(backgroundQueue) {
functionThatTakes10Seconds()
print("10 seconds later")
dispatch_group_leave(dg)
}
dispatch_group_wait(dg)
print("Done")
}
This will output
Start
10 seconds later
Done
The dg.wait() blocks the serial queue until the number of dg.leave calls matches the number of dg.enter calls. If you use this technique then you need to be careful to ensure that all possible completion paths for your wrapped operation call dg.leave. There are also variations on dg.wait() that take a timeout parameter.
As mentioned before, DispatchGroup is a very good mechanism for that.
You can use it for synchronous tasks:
let group = DispatchGroup()
DispatchQueue.global().async(group: group) {
syncTask()
}
group.notify(queue: .main) {
// done
}
It is better to use notify than wait, as wait does block the current thread, so it is safe on non-main threads.
You can also use it to perform async tasks:
let group = DispatchGroup()
group.enter()
asyncTask {
group.leave()
}
group.notify(queue: .main) {
// done
}
Or you can even perform any number of parallel tasks of any synchronicity:
let group = DispatchGroup()
group.enter()
asyncTask1 {
group.leave()
}
group.enter() //other way of doing a task with synchronous API
DispatchQueue.global().async {
syncTask1()
group.leave()
}
group.enter()
asyncTask2 {
group.leave()
}
DispatchQueue.global().async(group: group) {
syncTask2()
}
group.notify(queue: .main) {
// runs when all tasks are done
}
It is important to note a few things.
Always check if your asynchronous functions call the completion callback, sometimes third party libraries forget about that, or cases when your self is weak and nobody bothered to check if the body got evaluated when self is nil. If you don't check it then you can potentially hang and never get the notify callback.
Remember to perform all the needed group.enter() and group.async(group: group) calls before you call the group.notify. Otherwise you can get a race condition, and the group.notify block can fire, before you actually finish your tasks.
BAD EXAMPLE
let group = DispatchGroup()
DispatchQueue.global().async {
group.enter()
syncTask1()
group.leave()
}
group.notify(queue: .main) {
// Can run before syncTask1 completes - DON'T DO THIS
}
The answer to the question in your questions body:
I am trying to ensure that each call to the underlying library finishes before the next begins
A serial queue does guarantee that the tasks are progressed in the order you add them to the queue.
I do not really understand the question in the title though:
How does a serial queue ... know when a task is complete?

Resources