POSIX threading on ios - ios

I've started experimenting with POSix threads using the ios platform. Coming fro using NSThread it's pretty daunting.
Basically in my sample app I have a big array filled with type mystruct. Every so often (very frequently) I want to perform a task with the contents of one of these structs in the background so I pass it to detachnewthread to kick things off.
I think I have the basics down but Id like to get a professional opinion before I attempt to move on to more complicated stuff.
Does what I have here seem "o.k" and could you point out anything missing that could cause problems? Can you spot any memory management issues etc....
struct mystruct
{
pthread thread;
int a;
long c;
}
void detachnewthread(mystruct *str)
{
// pthread_t thread;
if(str)
{
int rc;
// printf("In detachnewthread: creating thread %d\n", str->soundid);
rc = pthread_create(&str->thread, NULL, DoStuffWithMyStruct, (void *)str);
if (rc){
printf("ERROR; return code from pthread_create() is %d\n", rc);
//exit(-1);
}
}
//
/* Last thing that main() should do */
// pthread_exit(NULL);
}
void *DoStuffWithMyStruct(void *threadid)
{
mystruct *sptr;
dptr = (mystruct *)threadid;
// do stuff with data in my struct
pthread_detach(soundptr->thread);
}

One potential issue would be how the storage for the passed in structure mystruct is created. The lifetime of that variable is very critical to its usage in the thread. For example, if the caller of detachnewthread had that declared on the stack and then returned before the thread finished, it would be undefined behavior. Likewise, if it were dynamically allocated, then it is necessary to make sure it is not freed before the thread is finished.
In response to the comment/question: The necessity of some kind of mutex depends on the usage. For the sake of discussion, I will assume it is dynamically allocated. If the calling thread fills in the contents of the structure prior to creating the "child" thread and can guarantee that it will not be freed until after the child thread exits, and the subsequent access is read/only, then you would not need a mutex to protect it. I can imagine that type of scenario if the structure contains information that the child thread needs for completing its task.
If, however, more than one thread will be accessing the contents of the structure and one or more threads will be changing the data (writing to the structure), then you probably do need a mutex to protect it.

Try using Apple's Grand Central Dispatch (GCD) which will manage the threads for you. GCD provides the capability to dispatch work, via blocks, to various queues that are managed by the system. Some of the queue types are concurrent, serial, and of course the main queue where the UI runs. Based upon the CPU resources at hand, the system will manage the queues and necessary threads to get the work done. A simple example, which shows the how you can nest calls to different queues is like this:
__block MYClass *blockSelf=self;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
[blockSelf doSomeWork];
dispatch_async(dispatch_get_main_queue(), ^{
[blockSelf.textField setStringValue:#"Some work is done, updating UI"];
});
});
__block MyClass *blockSelf=self is used simply to avoid retain cycles associated with how blocks work.
Apple's docs:
http://developer.apple.com/library/ios/#documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
Mike Ash's Q&A blog post:
http://mikeash.com/pyblog/friday-qa-2009-08-28-intro-to-grand-central-dispatch-part-i-basics-and-dispatch-queues.html

Related

What does it mean for something to be thread safe in iOS?

I often come across the key terms "thread safe" and wonder what it means. For example, in Firebase or Realm, some objects are considered "Thread Safe". What exactly does it mean for something to be thread safe?
Thread Unsafe
- If any object is allowed to modify by more than one thread at the same time.
Thread Safe
- If any object is not allowed to modify by more than one thread at the same time.
Generally, immutable objects are thread-safe.
An object is said to be thread safe if more than one thread can call methods or access the object's member data without any issues; an "issue" broadly being defined as being a departure from the behaviour when only accessed from only one thread.
For example an object that contains the code i = i + 1 for a regular integer i would not be thread safe, since two threads might encounter that statement and one thread might read the original value of i, increment it, then write back that single incremented value; all at the same time as another thread. In that way, i would be incremented only once, where it ought to have been incremented twice.
After searching for the answer, I got the following from this website:
Thread safe code can be safely called from multiple threads or concurrent tasks without causing any problems (data corruption, crashing, etc). Code that is not thread safe must only be run in one context at a time. An example of thread safe code is let a = ["thread-safe"]. This array is read-only and you can use it from multiple threads at the same time without issue. On the other hand, an array declared with var a = ["thread-unsafe"] is mutable and can be modified. That means it’s not thread-safe since several threads can access and modify the array at the same time with unpredictable results. Variables and data structures that are mutable and not inherently thread-safe should only be accessed from one thread at a time.
iOS Thread safe
[Atomicity, Visibility, Ordering]
[General lock, mutex, semaphore]
Thread safe means that your program works as expected. It is about multithreading envirompment, where we have a problem with shared resource with problems of Data races and Race Condition[About].
Apple provides us by Synchronization Tools:
Atomicity
Atomic Operations - lock free mechanism which is based on hardware instructions - for example Compare-And-Swap(CAS)[More]...
Objective-C OSAtomic, atomic property attribute[About]
[Swift Atomic Operations]
Visibility
Volatile Variable - read value from memory(no cache)
Objective-C volatile
Ordering
Memory Barriers - guarantees up-to date data[About]
Objective-C OSMemoryBarrier
Find problem in your code
Thread Sanitizer - uses self.recordAndCheckWrite(var) inside to figure out when(timestamp) and who(thread) access to variable
Synchronisation
Locks - thread can get a lock and nobody else access to the resource. NSLock.
Semaphore consists of Threads queue, Counter value and has wait() and signal() api. Semaphore allows a number of threads(Counter value) work with resource at a given moment. DispatchSemaphore, POSIX Semaphore - semaphore_t. App Group allows share POSIX semaphores
Mutex - mutual exclusion, mutually exclusive - is a type of Semaphore(allows several threads) where Thread can acquire it and is able to work with block as a single encroacher, all others thread will be blocked until release. The main different with lock is that mutex also works between processes(not only threads). Also it includes memory barrier.
var lock = os_unfair_lock_s()
os_unfair_lock_lock(&lock)
//critical section
os_unfair_lock_unlock(&lock)
NSLock -POSIX Mutex Lock - pthread_mutex_t, Objective-C #synchronized.
let lock = NSLock()
lock.lock()
//critical section
lock.unlock()
Recursive lock - Lock Reentrance - thread can acquire a lock several times. NSRecursiveLock
Spin lock - waiting thread checks if it can get a lock repeatedly based on polling mechanism. It is useful for small operation. In this case thread is not blocked and expensive operations like context switch is not nedded
[GCD]
Common approach is using custom serial queue with async call - where all access to memory will be done one by one:
serial read and write access
private let queue = DispatchQueue(label: "com.company")
self.queue.async {
//read and write access to shared resource
}
concurrent read and serial write access. when write is oocured - all previous read access finished -> write -> all other reads
private let queue = DispatchQueue(label: "com.company", attributes: .concurrent)
//read
self.queue.sync {
//read
}
//write
self.queue.sync(flags: .barrier) {
//write
}
Operations
[Actors]
actor MyData {
var sharedVariable = "Hello"
}
//using
Task {
await self.myData.sharedVariable = "World"
}
Multi threading:
[Concurrency vs Parallelism]
[Sync vs Async]
[Mutable vs Immutable] [let vs var]
[Swift thread safe Singleton]
[Swift Mutable/Immutable collection]
pthread - POSIX[About] thread
NSThead
To give a simple example. If something is shared across multiple threads without any issues like crash, it is thread-safe. For example, if you have a constant (let value = ["Facebook"]) and it is shared across multiple threads, it is thread safe because it is read-only and cannot be modified. Whereas, if you have a variable (var value = ["Facebook"]), it may cause potential crash or data loss when shared with multiple threads because it's data can be modified.

is there a way to still get which queue I am on instead of dispatch_get_current_queue? [duplicate]

Recently, I had the need for a function that I could use to guarantee synchronous execution of a given block on a particular serial dispatch queue. There was the possibility that this shared function could be called from something already running on that queue, so I needed to check for this case in order to prevent a deadlock from a synchronous dispatch to the same queue.
I used code like the following to do this:
void runSynchronouslyOnVideoProcessingQueue(void (^block)(void))
{
dispatch_queue_t videoProcessingQueue = [GPUImageOpenGLESContext sharedOpenGLESQueue];
if (dispatch_get_current_queue() == videoProcessingQueue)
{
block();
}
else
{
dispatch_sync(videoProcessingQueue, block);
}
}
This function relies on the use of dispatch_get_current_queue() to determine the identity of the queue this function is running on and compares that against the target queue. If there's a match, it knows to just run the block inline without the dispatch to that queue, because the function is already running on it.
I've heard conflicting things about whether or not it was proper to use dispatch_get_current_queue() to do comparisons like this, and I see this wording in the headers:
Recommended for debugging and logging purposes only:
The code must not make any assumptions about the queue returned,
unless it is one of the global queues or a queue the code has itself
created. The code must not assume that synchronous execution onto a
queue is safe from deadlock if that queue is not the one returned by
dispatch_get_current_queue().
Additionally, in iOS 6.0 (but not yet for Mountain Lion), the GCD headers now mark this function as being deprecated.
It sounds like I should not be using this function in this manner, but I'm not sure what I should use in its place. For a function like the above that targeted the main queue, I could use [NSThread isMainThread], but how can I check if I'm running on one of my custom serial queues so that I can prevent a deadlock?
Assign whatever identifier you want using dispatch_queue_set_specific(). You can then check your identifier using dispatch_get_specific().
Remember that dispatch_get_specific() is nice because it'll start at the current queue, and then walk up the target queues if the key isn't set on the current one. This usually doesn't matter, but can be useful in some cases.
This is a very simple solution. It is not as performant as using dispatch_queue_set_specific and dispatch_get_specific manually – I don't have the metrics on that.
#import <libkern/OSAtomic.h>
BOOL dispatch_is_on_queue(dispatch_queue_t queue)
{
int key;
static int32_t incrementer;
CFNumberRef value = CFBridgingRetain(#(OSAtomicIncrement32(&incrementer)));
dispatch_queue_set_specific(queue, &key, value, nil);
BOOL result = dispatch_get_specific(&key) == value;
dispatch_queue_set_specific(queue, &key, nil, nil);
CFRelease(value);
return result;
}

Does #synchronized guarantees for thread safety or not?

With reference to this answer, I am wondering is this correct?
#synchronized does not make any code "thread-safe"
As I tried to find any documentation or link to support this statement, for no success.
Any comments and/or answers will be appreciated on this.
For better thread safety we can go for other tools, this is known to me.
#synchronized does make code thread safe if it is used properly.
For example:
Lets say I have a class that accesses a non thread safe database. I don't want to read and write to the database at the same time as this will likely result in a crash.
So lets say I have two methods. storeData: and readData on a singleton class called LocalStore.
- (void)storeData:(NSData *)data
{
[self writeDataToDisk:data];
}
- (NSData *)readData
{
return [self readDataFromDisk];
}
Now If I were to dispatch each of these methods onto their own thread like so:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
[[LocalStore sharedStore] storeData:data];
});
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
[[LocalStore sharedStore] readData];
});
Chances are we would get a crash. However if we change our storeData and readData methods to use #synchronized
- (void)storeData:(NSData *)data
{
#synchronized(self) {
[self writeDataToDisk:data];
}
}
- (NSData *)readData
{
#synchronized(self) {
return [self readDataFromDisk];
}
}
Now this code would be thread safe. It is important to note that if I remove one of the #synchronized statements however the code would no longer be thread safe. Or if I were to synchronize different objects instead of self.
#synchronized creates a mutex lock on the object you are syncrhonizing. So in other words if any code wants to access code in a #synchronized(self) { } block it will have to get in line behind all previous code running within in that same block.
If we were to create different localStore objects, the #synchronized(self) would only lock down each object individually. Does that make sense?
Think of it like this. You have a whole bunch of people waiting in separate lines, each line is numbered 1-10. You can choose what line you want each person to wait in (by synchronizing on a per line basis), or if you don't use #synchronized you can jump straight to the front and skip all the lines. A person in line 1 doesn't have to wait for a person in line 2 to finish, but the person in line 1 does have to wait for everyone in front of them in their line to finish.
I think the essence of the question is:
is the proper use of synchronize able to solve any thread-safe
problem?
Technically yes, but in practice it's advisable to learn and use other tools.
I'll answer without assuming previous knowledge.
Correct code is code that conforms to its specification. A good specification defines
invariants constraining the state,
preconditions and postconditions describing the effects of the operations.
Thread-safe code is code that remains correct when executed by multiple threads. Thus,
No sequence of operations can violate the specification.1
Invariants and conditions will hold during multithread execution without requiring additional synchronization by the client2.
The high level takeaway point is: thread-safe requires that the specification holds true during multithread execution. To actually code this, we have to do just one thing: regulate the access to mutable shared state3. And there are three ways to do it:
Prevent the access.
Make the state immutable.
Synchronize the access.
The first two are simple. The third one requires preventing the following thread-safety problems:
liveness
deadlock: two threads block permanently waiting for each other to release a needed resource.
livelock: a thread is busy working but it's unable to make any progress.
starvation: a thread is perpetually denied access to resources it needs in order to make progress.
safe publication: both the reference and the state of the published object must be made visible to other threads at the same time.
race conditions A race condition is a defect where the output is dependent on the timing of uncontrollable events. In other words, a race condition happens when getting the right answer relies on lucky timing. Any compound operation can suffer a race condition, example: “check-then-act”, “put-if-absent”. An example problem would be if (counter) counter--;, and one of several solutions would be #synchronize(self){ if (counter) counter--;}.
To solve these problems we use tools like #synchronize, volatile, memory barriers, atomic operations, specific locks, queues, and synchronizers (semaphores, barriers).
And going back to the question:
is the proper use of #synchronize able to solve any thread-safe
problem?
Technically yes, because any tool mentioned above can be emulated with #synchronize. But it would result in poor performance and increase the chance of liveness related problems. Instead, you need to use the appropriate tool for each situation. Example:
counter++; // wrong, compound operation (fetch,++,set)
#synchronize(self){ counter++; } // correct but slow, thread contention
OSAtomicIncrement32(&count); // correct and fast, lockless atomic hw op
In the case of the linked question you could indeed use #synchronize, or a GCD read-write lock, or create a collection with lock stripping, or whatever the situation calls for. The right answer depend on the usage pattern. Any way you do it, you should document in your class what thread-safe guarantees are you offering.
1 That is, see the object on an invalid state or violate the pre/post conditions.
2 For example, if thread A iterates a collection X, and thread B removes an element, execution crashes. This is non thread-safe because the client will have to synchronize on the intrinsic lock of X (synchronize(X)) to have exclusive access. However, if the iterator returns a copy of the collection, the collection becomes thread-safe.
3 Immutable shared state, or mutable non shared objects are always thread-safe.
Generally, #synchronized guarantees thread safety, but only when used correctly. It is also safe to acquire the lock recursively, albeit with limitations I detail in my answer here.
There are several common ways to use #synchronized wrong. These are the most common:
Using #synchronized to ensure atomic object creation.
- (NSObject *)foo {
#synchronized(_foo) {
if (!_foo) {
_foo = [[NSObject alloc] init];
}
return _foo;
}
}
Because _foo will be nil when the lock is first acquired, no locking will occur and multiple threads can potentially create their own _foo before the first completes.
Using #synchronized to lock on a new object each time.
- (void)foo {
#synchronized([[NSObject alloc] init]) {
[self bar];
}
}
I've seen this code quite a bit, as well as the C# equivalent lock(new object()) {..}. Since it attempts to lock on a new object each time, it will always be allowed into the critical section of code. This is not some kind of code magic. It does absolutely nothing to ensure thread safety.
Lastly, locking on self.
- (void)foo {
#synchronized(self) {
[self bar];
}
}
While not by itself a problem, if your code uses any external code or is itself a library, it can be an issue. While internally the object is known as self, it externally has a variable name. If the external code calls #synchronized(_yourObject) {...} and you call #synchronized(self) {...}, you may find yourself in deadlock. It is best to create an internal object to lock upon that is not exposed outside of your object. Adding _lockObject = [[NSObject alloc] init]; inside your init function is cheap, easy, and safe.
EDIT:
I still get asked questions about this post, so here is an example of why it is a bad idea to use #synchronized(self) in practice.
#interface Foo : NSObject
- (void)doSomething;
#end
#implementation Foo
- (void)doSomething {
sleep(1);
#synchronized(self) {
NSLog(#"Critical Section.");
}
}
// Elsewhere in your code
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
Foo *foo = [[Foo alloc] init];
NSObject *lock = [[NSObject alloc] init];
dispatch_async(queue, ^{
for (int i=0; i<100; i++) {
#synchronized(lock) {
[foo doSomething];
}
NSLog(#"Background pass %d complete.", i);
}
});
for (int i=0; i<100; i++) {
#synchronized(foo) {
#synchronized(lock) {
[foo doSomething];
}
}
NSLog(#"Foreground pass %d complete.", i);
}
It should be obvious to see why this happens. Locking on foo and lock are called in different orders on the foreground VS background threads. It's easy to say that this is bad practice, but if Foo is a library, the user is unlikely to know that the code contains a lock.
#synchronized alone doesn't make code thread safe but it is one of the tools used in writing thread safe code.
With multi-threaded programs, it's often the case of a complex structure that you want to be maintained in a consistent state and you want only one thread to have access at a time. The common pattern is to use a mutex to protect a critical section of code where the structure is accessed and/or modified.
#synchronized is thread safe mechanism. Piece of code written inside this function becomes the part of critical section, to which only one thread can execute at a time.
#synchronize applies the lock implicitly whereas NSLock applies it explicitly.
It only assures the thread safety, not guarantees that. What I mean is you hire an expert driver for you car, still it doesn't guarantees car wont meet an accident. However probability remains the slightest.
It's companion in GCD(grand central dispatch) is dispatch_once. dispatch_once does the same work as to #synchronized.
The #synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code.
side-effects of mutex locks:
deadlocks
starvation
Thread safety will depend on usage of #synchronized block.

How can I verify that I am running on a given GCD queue without using dispatch_get_current_queue()?

Recently, I had the need for a function that I could use to guarantee synchronous execution of a given block on a particular serial dispatch queue. There was the possibility that this shared function could be called from something already running on that queue, so I needed to check for this case in order to prevent a deadlock from a synchronous dispatch to the same queue.
I used code like the following to do this:
void runSynchronouslyOnVideoProcessingQueue(void (^block)(void))
{
dispatch_queue_t videoProcessingQueue = [GPUImageOpenGLESContext sharedOpenGLESQueue];
if (dispatch_get_current_queue() == videoProcessingQueue)
{
block();
}
else
{
dispatch_sync(videoProcessingQueue, block);
}
}
This function relies on the use of dispatch_get_current_queue() to determine the identity of the queue this function is running on and compares that against the target queue. If there's a match, it knows to just run the block inline without the dispatch to that queue, because the function is already running on it.
I've heard conflicting things about whether or not it was proper to use dispatch_get_current_queue() to do comparisons like this, and I see this wording in the headers:
Recommended for debugging and logging purposes only:
The code must not make any assumptions about the queue returned,
unless it is one of the global queues or a queue the code has itself
created. The code must not assume that synchronous execution onto a
queue is safe from deadlock if that queue is not the one returned by
dispatch_get_current_queue().
Additionally, in iOS 6.0 (but not yet for Mountain Lion), the GCD headers now mark this function as being deprecated.
It sounds like I should not be using this function in this manner, but I'm not sure what I should use in its place. For a function like the above that targeted the main queue, I could use [NSThread isMainThread], but how can I check if I'm running on one of my custom serial queues so that I can prevent a deadlock?
Assign whatever identifier you want using dispatch_queue_set_specific(). You can then check your identifier using dispatch_get_specific().
Remember that dispatch_get_specific() is nice because it'll start at the current queue, and then walk up the target queues if the key isn't set on the current one. This usually doesn't matter, but can be useful in some cases.
This is a very simple solution. It is not as performant as using dispatch_queue_set_specific and dispatch_get_specific manually – I don't have the metrics on that.
#import <libkern/OSAtomic.h>
BOOL dispatch_is_on_queue(dispatch_queue_t queue)
{
int key;
static int32_t incrementer;
CFNumberRef value = CFBridgingRetain(#(OSAtomicIncrement32(&incrementer)));
dispatch_queue_set_specific(queue, &key, value, nil);
BOOL result = dispatch_get_specific(&key) == value;
dispatch_queue_set_specific(queue, &key, nil, nil);
CFRelease(value);
return result;
}

Threading NSStream

I have a TCP connection that is open continuously for communication with an external device. There is a lot going on in the communication pipe which causes the UI to become unresponsive at times.
I would like to put the communication on a separate thread. I understand detachNewThread and how it calls a #selector. My issue is that I am not sure how this would be used in conjunction with something like NSStream?
Rather than manually creating a thread and managing thread safety issues, you might prefer to use Grand Central Dispatch ('GCD'). That allows you to post blocks — which are packets of code and some state — off to be executed away from the main thread and wherever the OS thinks is most appropriate. If you create a serial dispatch queue you can even be certain that if you post a new block while an old one has yet to finish, the system will wait until it finishes.
E.g.
// you'd want to make this an instance variable in a real program
dispatch_queue_t serialDispatchQueue =
dispatch_queue_create(
"A handy label for debugging",
DISPATCH_QUEUE_SERIAL);
...
dispatch_async(serialDispatchQueue,
^{
NSLog(#"all code in here occurs on the dispatch queue ...");
});
/* lots of other things here */
dispatch_async(serialDispatchQueue,
^{
NSLog(#"... and this won't happen until after everything already dispatched");
});
...
// cleanup, for when you're done
dispatch_release(serialDispatchQueue);
A very quick introduction to GCD is here, Apple's more thorough introduction is also worth reading.

Resources