Thread Sanitizer in Xcode giving wrong error - iOS

import Dispatch

func doSomething() -> Int {
    var sum = 0
    let increaseWork = DispatchWorkItem {
        sum = sum + 100 // point 1
    }
    DispatchQueue.global().async(execute: increaseWork)
    increaseWork.wait() // blocks until the work item has finished
    return sum // point 2
}
Thread Sanitizer says that there is a race condition between point 1 and point 2. But I don't think there is one, as increaseWork.wait() is a blocking call and will not return until the closure has executed.

There could be a race condition between those 2 points. Imagine what happens if you execute the doSomething() function on 2 separate threads like this:
1. The first thread executes the increaseWork closure and finishes it. It is now at the line with wait.
2. The second thread starts executing and hits the async execute instruction.
3. The first thread hits the return instruction, as the wait can continue.
4. At the same time, the closure from the second thread is scheduled and executed.
At this point, you can't tell for sure what is executed first: the sum = sum + 100 from the second thread or the return sum from the first one.
The idea is that sum is a shared resource which is not synchronised, so, in theory, that could be a race condition. Even if you took care that such thing does not happen, the Thread Sanitizer detects a possible race condition as it does not know whether you start a single thread or you execute doSomething() function from 100 different threads at the same time.
UPDATE:
As I missed the fact that the sum variable is local, the above explanation does not answer the current question. The scenario described would never take place in the given piece of code.
But even though sum is a local variable, because it is captured and retained by a closure it will be allocated on the heap rather than on the stack. That's because the Swift compiler cannot determine whether the closure will have finished executing before the doSomething() function returns.
Why? Because the closure is passed to an initializer whose parameter behaves like an @escaping parameter. This implies that there is no guarantee about when the closure will be executed, so all the variables captured by the closure must be allocated on the heap for safety. Not knowing when the closure will be executed, the Thread Sanitizer cannot determine that the return sum statement will indeed be executed after the closure finishes.
So even though we can be sure that no race condition can happen here, the Thread Sanitizer raises an alarm because it cannot prove that.

Related

Block Operation - Completion Block returning random results

My block operation completion handler is displaying random results. Not sure why. I've read this and all lessons say it is similar to Dispatch Groups in GCD
Please find my code below
import Foundation

let sentence = "I love my car"
let wordOperation = BlockOperation()
var wordArray = [String]()
for word in sentence.split(separator: " ") {
    wordOperation.addExecutionBlock {
        print(word)
        wordArray.append(String(word))
    }
}
wordOperation.completionBlock = {
    print(wordArray)
    print("Completion Block")
}
wordOperation.start()
I was expecting my output to be ["I", "love", "my", "car"] (it should display all these words, either in sequence or in random order).
But when I run it, my output is either ["my"] or ["love"] or ["I", "car"]. It prints a random subset without all the expected values.
I'm not sure why this is happening. Please advise.
The problem is that those separate execution blocks may run concurrently with respect to each other, on separate threads. This is true if you start the operation like you have, or even if you added this operation to an operation queue with maxConcurrentOperationCount of 1. As the documentation says, when dealing with addExecutionBlock:
The specified block should not make any assumptions about its execution environment.
On top of this, Swift arrays are not thread-safe. So in the absence of synchronization, concurrent interaction with a non-thread-safe object may result in unexpected behavior, such as what you’ve shared with us.
If you turn on TSAN, the thread sanitizer, (found in “Product” » “Scheme” » “Edit Scheme...”, or press ⌘+<, and then choose “Run” » “Diagnostics” » “Thread Sanitizer”) it will warn you about the data race.
So, bottom line, the problem isn’t addExecutionBlock, per se, but rather the attempt to mutate the array from multiple threads at the same time. If you used a concurrent queue in conjunction with a dispatch group, you could experience similar problems (though, as with many race conditions, they can be hard to reproduce).
Theoretically, one could add synchronization code to your code snippet and that would fix the problem. But then again, it would be silly to try to initiate a bunch of concurrent updates, only to then employ synchronization within that to prevent concurrent updates. It would work, but would be inefficient. You only employ that pattern when the work on the background threads is substantial in comparison to the amount of time spent synchronizing updates to some shared resource. But that’s not the case here.
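For illustration only, here is a minimal sketch of that "add synchronization" idea, written in C with POSIX threads rather than Swift. The names word_list, append_word and collect_words are hypothetical and not from the question. Each worker appends while holding a mutex, so no append is lost, although the order of the words is still unspecified.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORDS 16

/* Hypothetical shared list; list_lock guards word_list and word_count. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static const char *word_list[MAX_WORDS];
static int word_count = 0;

/* Thread entry point: append one word while holding the lock. */
static void *append_word(void *arg)
{
    pthread_mutex_lock(&list_lock);
    if (word_count < MAX_WORDS)
        word_list[word_count++] = (const char *)arg;
    pthread_mutex_unlock(&list_lock);
    return NULL;
}

/* Start one thread per word, wait for all of them, and return how many
 * words ended up in the list. */
int collect_words(const char *const *words, int n)
{
    pthread_t threads[MAX_WORDS];
    int started = 0;

    assert(n >= 0 && n <= MAX_WORDS);

    pthread_mutex_lock(&list_lock);
    word_count = 0;
    pthread_mutex_unlock(&list_lock);

    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, append_word,
                           (void *)words[i]) == 0)
            started++;
    }
    /* pthread_join orders each worker's writes before the read below. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return word_count;
}
```

Calling collect_words with the four words of "I love my car" should store all four. As the answer says, the lock serializes the very work that was made concurrent, so this only pays off when each worker does substantial work outside the critical section.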

Is the time given to every thread the same, or is there a minimum time?

For example, thread 1 is executing something that uses a global variable, but another thread may change its value:
thread 1
a = 1;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSLog(@"a = %d", a);
});
thread 2
a = 2;
There are two questions:
1. If thread 1 executes first, can I assume the system will always print a = 1? Or can the system switch to thread 2 halfway, then switch back to thread 1, and print a = 2?
2. If I don't put the NSLog in dispatch_async(), would that cause a different result?
You can't tell or guarantee exactly. You can assign the threads (or queues) priorities but you still don't know exactly what will happen.
Yes, logging will make a difference to the runtime. Again, you don't know whether it will make a difference to the thread management.
So, if you need something to be protected from access by multiple threads then you need to protect it by adding some synchronisation. How you choose to do that depends on what it is and each case needs to be considered separately.

NSCondition, what if no lock when call signal?

According to Apple's documentation for NSCondition, it should be used like this:
Thread 1:
[cocoaCondition lock];
while (timeToDoWork <= 0)
    [cocoaCondition wait];
timeToDoWork--;
// Do real work here.
[cocoaCondition unlock];
Thread 2:
[cocoaCondition lock];
timeToDoWork++;
[cocoaCondition signal];
[cocoaCondition unlock];
And in the documentation of the signal method of NSCondition:
You use this method to wake up one thread that is waiting on the condition. You may call this method multiple times to wake up multiple threads. If no threads are waiting on the condition, this method does nothing. To avoid race conditions, you should invoke this method only while the receiver is locked.
My question is:
I don't want Thread 2 to be blocked in any situation, so I removed the lock and unlock calls in Thread 2. That is, Thread 2 can queue as much work as it wishes, and Thread 1 will do the work one item at a time; if there is no more work, it waits (blocked). This is also a producer-consumer pattern, but the producer is never blocked.
But this approach is not correct according to Apple's documentation. So what could possibly go wrong with this pattern? Thanks.
Failing to lock is a serious problem when multiple threads are accessing shared data. In the example from Apple's code, if Thread 2 doesn't lock the condition object then it can be incrementing timeToDoWork at the same time that Thread 1 is decrementing it. That can result in the results from one of those operations being lost. For example:
1. Thread 1 reads the current value of timeToDoWork, gets 1
2. Thread 2 reads the current value of timeToDoWork, gets 1
3. Thread 2 computes the incremented value (timeToDoWork + 1), gets 2
4. Thread 1 computes the decremented value (timeToDoWork - 1), gets 0
5. Thread 2 writes the new value of timeToDoWork, stores 2
6. Thread 1 writes the new value of timeToDoWork, stores 0
timeToDoWork started at 1, was incremented and decremented, so it should end at 1, but it actually ends at 0. By rearranging the steps, it could end up at 2, instead. Presumably, the value of timeToDoWork represents something real and important. Getting it wrong would probably screw up the program.
If your two threads are doing something as simple as incrementing and decrementing a number, they can do it without locks by using the atomic operation functions, such as OSAtomicIncrement32Barrier() and OSAtomicDecrement32Barrier(). However, if the shared data is any more complicated than that (and it probably is in any non-trivial case), then they really need to use synchronization mechanisms such as condition locks.
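As a rough sketch of the lock-free counter idea, the following uses C11 <stdatomic.h> instead of the OSAtomic functions named above. That swap is my assumption: newer Apple SDKs mark the OSAtomic family as deprecated in favor of the C11 atomics. The names time_to_do_work, producer, consumer and run_counter_demo are hypothetical. The counter is allowed to go negative here, which a real work counter would not; the point is only that no increment or decrement is lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERATIONS 100000

/* Stand-in for timeToDoWork from the example above. */
static atomic_int time_to_do_work = 1;

/* Adds 1 to the counter ITERATIONS times. */
static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        atomic_fetch_add(&time_to_do_work, 1); /* indivisible read-modify-write */
    return NULL;
}

/* Subtracts 1 from the counter ITERATIONS times. */
static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        atomic_fetch_sub(&time_to_do_work, 1);
    return NULL;
}

/* Run both threads to completion and return the final counter value. */
int run_counter_demo(void)
{
    pthread_t p, c;
    int rc;

    atomic_store(&time_to_do_work, 1);

    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    rc = pthread_create(&c, NULL, consumer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return atomic_load(&time_to_do_work);
}
```

Because the same number of increments and decrements are applied to a starting value of 1, the result should be 1 on every run. Note that this only protects the counter itself. It does not make it safe to signal a condition without holding its lock, since a signal sent between the waiter's check and its wait can still be missed.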

pthread_cond_signal() release exactly one thread?

Does pthread_cond_signal unblock exactly one thread? If not, what will be the case it releases more than one thread? The specification says as follows:
The pthread_cond_signal() function shall unblock at least one of the
threads that are blocked on the specified condition variable cond (if
any threads are blocked on cond).
The pthreads specification allows for "spurious wakeups" in an implementation. See, for example, the hypothetical implementation of pthread_cond_signal and pthread_cond_wait sketched in the specification that allows for just this condition.
The possibility of spurious wakeups is why one always associates some predicate with a condition, and checks that predicate upon wakeup.
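A minimal sketch of that predicate check, assuming a hypothetical flag named ready and helper names (waiter, wait_for_payload) that are mine rather than from the specification. The waiter re-tests the flag in a while loop, so an extra wakeup simply sends it back to waiting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ready_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static bool ready = false; /* the predicate, guarded by ready_lock */
static int payload = 0;    /* data published together with the predicate */

/* Waits until ready is true, then copies payload into *out. */
static void *waiter(void *out)
{
    pthread_mutex_lock(&ready_lock);
    while (!ready) /* re-check after every wakeup, spurious or not */
        pthread_cond_wait(&ready_cond, &ready_lock);
    *(int *)out = payload;
    pthread_mutex_unlock(&ready_lock);
    return NULL;
}

/* Starts a waiter, publishes value, and returns what the waiter saw. */
int wait_for_payload(int value)
{
    pthread_t t;
    int seen = -1;
    int rc;

    pthread_mutex_lock(&ready_lock);
    ready = false;
    pthread_mutex_unlock(&ready_lock);

    rc = pthread_create(&t, NULL, waiter, &seen);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&ready_lock);
    payload = value;
    ready = true; /* change the predicate before signaling */
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&ready_lock);

    pthread_join(t, NULL);
    return seen;
}
```

The loop also covers the case where the signal is sent before the waiter has started waiting: the waiter then finds ready already true and never calls pthread_cond_wait() at all.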

Mutex are needed to protect the Condition Variables

It is said that mutexes are needed to protect condition variables.
Is the reference here to the actual condition variable declared as pthread_cond_t
OR
to a normal shared variable count whose value decides the signaling and waiting?
is the reference here to the actual condition variable declared as pthread_cond_t or a normal shared variable count whose values decide the signaling and wait?
The reference is to both.
The mutex ensures that the shared variable (count in your question) can be checked and, if its value doesn't meet the desired condition, that the wait performed inside pthread_cond_wait() occurs atomically with respect to that check.
The problem being solved with the mutex is that you have two separate operations that need to be atomic:
check the condition of count
wait inside of pthread_cond_wait() if the condition isn't met yet.
A pthread_cond_signal() doesn't 'persist' - if there are no threads waiting on the pthread_cond_t object, a signal does nothing. So if there wasn't a mutex making the two operations listed above atomic with respect to one another, you could find yourself in the following situation:
Thread A wants to do something once count is non-zero.
Thread B will signal when it increments count (which will set count to something other than zero).
1. Thread A checks count and finds that it's zero.
2. Before A gets to call pthread_cond_wait(), thread B comes along, increments count to 1, and calls pthread_cond_signal(). That call actually does nothing of consequence, since A isn't waiting on the pthread_cond_t object yet.
3. A calls pthread_cond_wait(), but since condition variable signals aren't remembered, it will block at this point and wait for the signal that has already come and gone.
The mutex (as long as all threads are following the rules) makes it so that item #2 cannot occur between items 1 and 3. The only way that thread "B" will get a chance to increment count is either before A looks at count or after "A" is already waiting for the signal.
A condition variable must always be associated with a mutex, to avoid the race condition where a thread prepares to wait on a condition variable and another thread signals the condition just before the first thread actually waits on it.
More info here
A sample:
Thread 1 (waits for the condition)
// cond_mutex and cond are assumed to be pointers (pthread_mutex_t *, pthread_cond_t *)
pthread_mutex_lock(cond_mutex);
while (i < 5)
{
    pthread_cond_wait(cond, cond_mutex);
}
pthread_mutex_unlock(cond_mutex);
Thread 2 (signals the condition)
pthread_mutex_lock(cond_mutex);
i++;
if (i >= 5)
{
    pthread_cond_signal(cond);
}
pthread_mutex_unlock(cond_mutex);
As you can see in the sample above, the mutex protects the variable 'i', which determines the condition. When we see that the condition is not met, we go into a condition wait, which implicitly releases the mutex. That allows the signalling thread to acquire the mutex and work on 'i' without a race condition.
Now, as per your question: the signalling thread must acquire the mutex before it signals. Otherwise the first thread might check the condition, see that it is not met, and go into the condition wait just after the second thread has already signalled. No one will signal it after that, and the first thread will keep waiting forever. So, in this sense, the mutex is for both the condition and the condition variable.
Per the pthreads docs the reason that the mutex was not separated is that there is a significant performance improvement by combining them and they expect that because of common race conditions if you don't use a mutex, it's almost always going to be done anyway.
https://linux.die.net/man/3/pthread_cond_wait
Features of Mutexes and Condition Variables
It had been suggested that the mutex acquisition and release be
decoupled from condition wait. This was rejected because it is the
combined nature of the operation that, in fact, facilitates realtime
implementations. Those implementations can atomically move a
high-priority thread between the condition variable and the mutex in a
manner that is transparent to the caller. This can prevent extra
context switches and provide more deterministic acquisition of a mutex
when the waiting thread is signaled. Thus, fairness and priority
issues can be dealt with directly by the scheduling discipline.
Furthermore, the current condition wait operation matches existing
practice.
I thought that a better use-case might help better explain conditional variables and their associated mutex.
I use posix conditional variables to implement what is called a Barrier Sync. Basically, I use it in an app where I have 15 (data plane) threads that all do the same thing, and I want them all to wait until all data planes have completed their initialization. Once they have all finished their (internal) data plane initialization, then they can start processing data.
Here is the code. Notice that I copied the algorithm from Boost, since I couldn't use templates in this particular application:
void LinuxPlatformManager::barrierSync()
{
    // Algorithm taken from boost::barrier

    // In the class constructor, the variables are initialized as follows:
    //   barrierGeneration_ = 0;
    //   barrierCounter_ = numCores_; // numCores_ is 15
    //   barrierThreshold_ = numCores_;

    // Locking the mutex here synchronizes all condVar logic manipulation
    // from this point until the point where either pthread_cond_wait() or
    // pthread_cond_broadcast() is called below
    pthread_mutex_lock(&barrierMutex_);

    int gen = barrierGeneration_;
    if(--barrierCounter_ == 0)
    {
        // The last thread to call barrierSync() enters here,
        // meaning they have all called barrierSync()
        barrierGeneration_++;
        barrierCounter_ = barrierThreshold_;

        // broadcast is the same as signal, but it signals ALL waiting threads
        pthread_cond_broadcast(&barrierCond_);
    }

    while(gen == barrierGeneration_)
    {
        // All but the last thread to call this method enter here
        // This call is blocking, not on the mutex, but on the condVar
        // this call actually releases the mutex
        pthread_cond_wait(&barrierCond_, &barrierMutex_);
    }

    pthread_mutex_unlock(&barrierMutex_);
}
Notice that every thread that enters the barrierSync() method locks the mutex, which makes everything between the mutex lock and the call to either pthread_cond_wait() or pthread_mutex_unlock() atomic. Also notice that pthread_cond_wait() releases the mutex while the thread waits. The pthread_cond_wait() documentation also states that the behavior is undefined if you call it without having first locked the mutex.
If pthread_cond_wait() did not release the mutex, then all other threads would block on the call to pthread_mutex_lock() at the beginning of the barrierSync() method, and it wouldn't be possible to decrement the barrierCounter_ variable (or manipulate the related variables) atomically and in a thread-safe manner to know how many threads have called barrierSync().
So to summarize all of this, the mutex associated with the Conditional Variable is not used to protect the Conditional Variable itself, but rather it is used to make the logic associated with the condition (barrierCounter_, etc) atomic and thread-safe. When the threads block waiting for the condition to become true, they are actually blocking on the Conditional Variable, not on the associated mutex. And a call to pthread_cond_broadcast/signal() will unblock them.
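To try the barrier outside of the original class, here is a self-contained C sketch of the same generation-counter algorithm with a small driver. Everything beyond the algorithm itself is my assumption: the thread count of 4, the plain globals, and the names data_plane and run_barrier_demo. The driver counts how many threads, after leaving the barrier, observed that every thread had finished its pretend initialization.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4 /* hypothetical; the answer above uses 15 */

static pthread_mutex_t barrier_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t barrier_cond = PTHREAD_COND_INITIALIZER;
static int barrier_counter = NUM_THREADS;
static int barrier_generation = 0;

static int initialized = 0; /* threads that finished "init"; guarded by barrier_mutex */
static int saw_all = 0;     /* threads that saw everyone initialized after the barrier */

/* Blocks until NUM_THREADS threads have called it, then releases them all. */
static void barrier_sync(void)
{
    pthread_mutex_lock(&barrier_mutex);
    int gen = barrier_generation;
    if (--barrier_counter == 0) {
        /* Last arrival: start a new generation and wake the waiters. */
        barrier_generation++;
        barrier_counter = NUM_THREADS;
        pthread_cond_broadcast(&barrier_cond);
    }
    while (gen == barrier_generation)
        pthread_cond_wait(&barrier_cond, &barrier_mutex); /* releases the mutex while waiting */
    pthread_mutex_unlock(&barrier_mutex);
}

/* Pretend data-plane thread: "initialize", wait at the barrier, then check. */
static void *data_plane(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&barrier_mutex);
    initialized++;
    pthread_mutex_unlock(&barrier_mutex);

    barrier_sync();

    pthread_mutex_lock(&barrier_mutex);
    if (initialized == NUM_THREADS)
        saw_all++;
    pthread_mutex_unlock(&barrier_mutex);
    return NULL;
}

/* Runs NUM_THREADS data-plane threads and returns how many saw all peers ready. */
int run_barrier_demo(void)
{
    pthread_t threads[NUM_THREADS];

    pthread_mutex_lock(&barrier_mutex);
    initialized = 0;
    saw_all = 0;
    pthread_mutex_unlock(&barrier_mutex);

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, data_plane, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return saw_all;
}
```

If the barrier works, run_barrier_demo() returns NUM_THREADS, because no thread can get past barrier_sync() until the last one has incremented initialized. The generation counter is what makes the barrier reusable: a second call to run_barrier_demo() goes through the same barrier again.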
Here is another resource related to pthread_cond_broadcast() and pthread_cond_signal() for an additional reference.
