Is it mandatory to join a non-detached thread? - pthreads

I create a non-detached thread which can be stopped through 2 different methods. One of them (stop()) provides a sync interface by joining to the spawned thread while the other one (stopAsync()) just sets the control variable m_continue to false and doesn't wait for the mentioned thread to finish.
void run()
{
while(m_continue)
{
// Do something
}
}
void start()
{
m_continue = true;
// Spawn a new thread for run()
}
void stop()
{
m_continue = false;
// Join to the spawned thread
}
void stopAsync()
{
m_continue = false;
}
In case of stopAsync() is called, the while loop in the run() method finishes but no one joins to that thread. Is there any resource leak for this case ? Is it compulsory to join a non-detached thread ?
I couldn't figure out from the documentation. The 'Notes' part tells something that I am not able to understand.
A thread may either be joinable or detached. If a thread is joinable, then another thread can call pthread_join(3) to wait for the thread to terminate and fetch its exit status. Only when a terminated joinable thread has been joined are the last of its resources released back to the system. When a detached thread terminates, its resources are automatically released back to the system: it is not possible to join with the thread in order to obtain its exit status. Making a thread detached is useful for some types of daemon threads whose exit status the application does not need to care about. By default, a new thread is created in a joinable state, unless attr was set to create the thread in a detached state (using pthread_attr_setdetachstate(3)).
https://linux.die.net/man/3/pthread_create

Only when a terminated joinable thread has been joined are the last of its resources released back to the system.
The implication of this part of the documentation is that some resources for the thread will remain allocated after it has exited, if you don't join it. So doing this is analogous to calling malloc() without free() - it's not "mandatory" in the sense that it won't cause an immediate problem for your program if you don't, but it does represent a resource leak so if you do this repetitively in a long-running program you can exhaust the relevant resource.
In lieu of joining the thread, your stopAsync() can just call pthread_detach() on the other thread before setting the flag variable.
Note that your m_continue variable should be protected by a synchronisation primitive like a mutex or reader-writer lock.

Related

GCD Serial Queue dispatch async and sync

I have some doubts regarding GCD.
Code snippet 1
serialQ.sync {
print(1)
serialQ.async {
print(2)
}
serialQ.async {
print(3)
}
}
Code snippet 2
serialQ.async {
print(1)
serialQ.async {
print(2)
}
serialQ.sync {
print(3)
}
}
I ran both of them in playground, and found that Code snippet 2 gives deadlock while Code snippet 1 runs fine. I have read a lot about GCD and started playing around with these concepts. Can anyone please provide a detailed explanation for the same ?
PS : serialQ is a serial Queue
According to my understanding,
Serial Queue - generates only one thread at a time, and once that thread is freed up then it is occupied or free to do other tasks
Serial Queue dispatched sync - blocks the caller thread from which the serial queue is dispatched and performs the tasks on that thread.
Serial Queue dispatched async - does'nt not blocks the caller thread, infact it runs on a different thread and keeps the caller
thread running.
But for the above query I am not able to get the proper explanation.
You are calling sync inside a block already executing on the same queue. This will always cause a deadlock. sync is the equivalent of saying “execute this now and wait for it to return.” Since you are already executing on that queue, the queue never becomes available to execute the sync block. It sounds like you’re looking for recursive locks, but that’s not how queues work. It’s also quite arguably an anti-pattern in general. I discuss this more in this answer: How to implement a reentrant locking mechanism in objective-c through GCD?
EDIT: Came back to add some thoughts on your "understandings":
Serial Queue - generates only one thread at a time, and once that thread is freed up then it is occupied or free to do other tasks
A serial queue doesn't "generate" one thread. Queues and threads are different things and have different semantics. A serial queue requires one thread upon which to execute a work item, but there's not a one-to-one relationship between a serial queue and a thread. A thread is a relatively "heavy" resource and a queue is a relatively "light" resource. A single serial queue can execute work items on more than one thread over its lifetime (although never more than one thread at the same time). GCD maintains pools of threads that it uses to execute work items, but that is an implementation detail, and it's not necessary to understand how that's implemented in order to use queues properly.
Serial Queue dispatched sync - blocks the caller thread from which the serial queue is dispatched and performs the tasks on that thread.
A queue (serial or concurrent) is not "dispatched" (sync or otherwise). A work item is, well, enqueued into a queue. That work item will be subsequently executed by an arbitrary thread including, quite possibly, the calling thread. The guarantee is that only one work item enqueued to a given serial queue will be executing (on any thread) at one time.
Serial Queue dispatched async - doesn't block the ~caller~ enqueueing thread, in fact it runs on a different thread and keeps the caller thread running. (minor edits for readability)
This is close, but not quite accurate. It's true that enqueueing a work item onto a serial queue with async does not block the enqueueing thread. It's not necessarily true that the work item is executed by a different thread than the enqueueing thread, although in the common case, that is usually the case.
The thing to know here is that the difference between sync and async is strictly limited to the behavior of the enqueueing thread and has no (guaranteed) bearing or impact on which thread the work item is executed. If you enqueue a work item with sync the enqueueing thread will wait (possibly forever, in the specific case you outlined here) for the work item to complete, whereas if you enqueue a work item with async the enqueueing thread will continue executing.
A sync call, as you point out, blocks the current thread until the block runs. So when you make sync to the same serial queue that you’re currently on, you are blocking the queue, waiting for a block to run on the same queue that you just blocked, resulting in a deadlock.
If you really want to run something synchronously on the current queue, don’t dispatch it with sync at all and just run it directly. E.g.:
serialQ.async {
print(1)
serialQ.async {
print(2)
}
// serialQ.sync { // don't dispatch synchronously to the current serial queue
print(3)
// }
}
Or dispatch asynchronously. E.g.,
serialQ.async {
print(1)
serialQ.async {
print(2)
}
serialQ.async {
print(3)
}
}
Or use a concurrent queue (in which case you have to be careful to make sure you don’t have thread explosion, which could result in deadlock, too). E.g.,
let concurrentQ = DispatchQueue(label: "...", attributes: .concurrent)
concurrentQ.async {
print(1)
concurrentQ.async {
print(2)
}
concurrentQ.sync {
print(3)
}
}

Does async operation in iOS create a new thread internally, and allocate task to it?

Does async operation in iOS, internally create a new thread, and allocate task to it ?
An async operation is capable to internally create a new thread and allocate task to it. But in order for this to happen you need to run an async operation which creates a new thread and allocates task to it. Or in other words: There is no direct correlation.
I assume that by async you mean something like DispatchQueue.main.async { <#code here#> }. This does not create a new thread as main thread should already be present. How and why does this work can be (if oversimplified) explained with an array of operations and an endless loop which is basically what RunLoop is there for. Imagine the following:
Array<Operations> allOperations;
int main() {
bool continueRunning = true;
for(;continueRunning;) {
allOperations.forEach { $0.run(); }
allOperations.clear();
}
return 0;
}
And when you call something like DispatchQueue.main.async it basically creates a new operation and inserts it into allOperations. The same thread will eventually go into a new loop (within for-loop) and call your operation asynchronously. Again keep in mind that this is all over-simplified just to illustrate the idea behind all of it. You can from this also imagine how for instance timers work; the operation will evaluate if current time is greater then the one of next scheduled execution and if so it will trigger the operation on timer. That is also why timers can not be very precise since they depend on rest of execution and thread may be busy.
A new thread on the other hand may be spawned when you create a new queue DispatchQueue(label: "Will most likely run on a new thread"). When(if) exactly will a thread be made is not something that needs to be fixed. It may vary from implementations and systems being run on. The tool will only guarantee to perform what it is designed for but not how it will do it.
And then there is also Thread class which can generate a new thread. But the deal is same as for previous one; it might internally instantly create a new thread or it might do it later, lazily. All it guarantees is that it will work for it's public interface.
I am not saying that these things change over time, implementation or system they run on. I am only saying that they potentially could and they might have had.

Difference between pthread_exit(PTHREAD_CANCELED) and pthread_cancel(pthread_self())

When pthread_exit(PTHREAD_CANCELED) is called I have expected behavior (stack unwinding, destructors calls) but the call to pthread_cancel(pthread_self()) just terminated the thread.
Why pthread_exit(PTHREAD_CANCELED) and pthread_cancel(pthread_self()) differ significantly and the thread memory is not released in the later case?
The background is as follows:
The calls are made from a signal handler and reasoning behind this strange approach is to cancel a thread waiting for the external library semop() to complete (looping around on EINTR I suppose)
I have noticed that calling pthread_cancel from other thread does not work (as if semop was not a cancellation point) but signalling the thread and then calling pthread_exit works but calls the destructor within a signal handler.
pthread_cancel could postpone the action to the next cancellation point.
In terms of thread specific clean-up behaviour there should be no difference between cancelling a thread via pthread_cancel() and exiting a thread via pthread_exit().
POSIX says:
[...] When the cancellation is acted on, the cancellation clean-up handlers for thread shall be called. When the last cancellation clean-up handler returns, the thread-specific data destructor functions shall be called for thread. When the last destructor function returns, thread shall be terminated.
From Linux's man pthread_cancel:
When a cancellation requested is acted on, the following steps occur for thread (in this order):
Cancellation clean-up handlers are popped (in the reverse of the order in which they were pushed) and called. (See pthread_cleanup_push(3).)
Thread-specific data destructors are called, in an unspecified order. (See pthread_key_create(3).)
The thread is terminated. (See pthread_exit(3).)
Referring the strategy to introduce a cancellation point by signalling a thread, I have my doubts this were the cleanest way.
As many system calls return on receiving a signal while setting errno to EINTR, it would be easy to catch this case and simply let the thread end itself cleanly under this condition via pthread_exit().
Some pseudo code:
while (some condition)
{
if (-1 == semop(...))
{ /* getting here on error or signal reception */
if (EINTR == errno)
{ /* getting here on signal reception */
pthread_exit(...);
}
}
}
Turned out that there is no difference.
However some interesting side effects took place.
Operations on std::iostream especially cerr/cout include cancellation points. When the underlying operation is canceled the stream is marked as not good. So you will get no output from any other thread if only one has discovered cancellation on an attempt to print.
So play with pthread_setcancelstate() and pthread_testcancel() or just call cerr.clear() when needed.
Applies to C++ streams only, stderr,stdin seems not be affected.
First of all, there are two things associated to thread which will tell what to do when you call pthread_cancel().
1. pthread_setcancelstate
2. pthread_setcanceltype
first function will tell whether that particular thread can be cancelled or not, and the second function tells when and how that thread should be cancelled, for example, should that thread be terminated as soon as you send cancellation request or it need to wait till that thread reaches some milestone before getting terminated.
when you call pthread_cancel(), thread wont be terminated directly, above two actions will be performed, i.e., checking whether that thread can be cancelled or not, and if yes, when to cancel.
if you disable cancel state, then pthread_cancel() can't terminate that thread, but the cancellation request will stay in a queue waiting for that thread to become cancellable, i.e., at some point of time if you are enabling cancel state, then your cancel request will start working on terminating that thread
whereas if you use pthread_exit(), then the thread will be terminated irrespective to the cancel state and cancel type of that particular thread.
*this is one of the differences between pthread_exit() and pthread_cancel(), there can be few more.

pthreads wait and signal doubts linux

Before pthread wait we lock using a mutex so that some other code might not try to change the condition variable. wait then unlocks the mutex and waits for the signal.
Say, in some other thread i had locked the same mutex and after that, i had used 'signal'. and then unlock thread.
when signal is done, the waiting thread wakes up and aquires the mutex again.
Thread1 Thread2
{ {
lock(mutex); lock(mutex);
wait(mutex); signal(mutex);
unlock(mutex); unlock(mutex);
} }
Say the three thread one statements are enclosed in a while(1) loop. Then assume that thread2 locks the mutex, signals it, and unlocks the mutex. and then doesn't end, but goes to sleep.
So will the value of the condition variable be changed permanently? If three statements of thread one are running in infinite lop, will it never wait and just find that the signal has been given? When wait call returns, does it set the value of the condition variable back to initial value?
If yes, can I use create,destroy or initialize methods on the variables to set the value back? If yes, how? What exactly do these functions do?
Thanks,
pthread_cond_signal() will always wake at least one thread that is currently waiting on that condition variable in pthread_cond_wait(). If the same thread or a different thread then calls pthread_cond_wait() again, it will block and wait for another signal.
This means that pthread condition variables must always be paired with some kind of shared data, protected by the mutex that is held when calling pthread_cond_wait(). Before calling pthread_cond_wait(), the thread must check the shared data to see if the condition it wants to wait for has occurred - if not, it shouldn't wait.
The simplest example of such shared data might be a global flag. In your example:
int flag = 0;
Thread 1 {
pthread_mutex_lock(&mutex);
while (!flag)
pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);
}
Thread 2 {
pthread_mutex_lock(&mutex);
flag = 1;
pthread_mutex_signal(&cond);
pthread_mutex_unlock(&mutex);
}
You can see here that when the condition is "reset" is entirely under your control - for example you could have Thread 1 set flag = 0; before it calls pthread_mutex_unlock().
The shared state is often more complex than a simple flag - for example you might have a producer thread call pthread_mutex_wait() while there is no room in a shared buffer.

topic regarding POSIX thread

How does a thread knows when to exit?
SITUATION:
-while the main program must wait for the threads to run to completion.
-This can be done by using a prototype function called pthread_join.
-after all, a call to this function waits for the termination of the thread whose id is given by thread itself.
After you have called pthread_join(ptherad_t &var)the main will wait untill all the threads for which you have called join have exited.
once all the threads are completed their tasks,
when it calls pthread_exit(NULL) main will exit.
inside the thread after its task is completed you need to call pthread_exit(NULL) which will stop the excution of the thread.But this is not mandatory and the thread can simply return which means that the thread has completed.
when it(thread) calls pthread_exit(NULL),the calling thread will exit.

Resources