Can I use pthread_join() to check for terminated thread? - pthreads

I need to know if some thread already terminated (if it's not, I must wait for it).
If I call pthread_join() on terminated thread, it always returns success in my version of glibc.
But documentation for pthread_join() says that it must return error with code ESRCH if thread already terminated.
If I call pthread_kill(thread_id, 0) it returns with error code ESRCH (as expected).
Inside glibc sources I see that inside pthread_join() there is simple checking for valid thread_id, but not real checking if thread exist.
And inside pthread_kill() there is real checking (in some kernel's list).
There is my test program:
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
void * thread_func(void *arg)
{
printf("Hello! I`m thread_func!\nGood-bye!\n");
return NULL;
}
int main(void)
{
int res;
pthread_t thread_id;
printf("Hello from main()!\n");
pthread_create(&thread_id, NULL, thread_func, NULL);
printf("Waiting...\n");
sleep(3);
res = pthread_join(thread_id, NULL);
printf("pthread_join() returned %d (%s)\n", res, strerror(res));
res = pthread_kill(thread_id, 0);
printf("pthread_kill() returned %d (%s)\n", res, strerror(res));
return 0;
}
It's output:
Hello!
Waiting...
Hello! I`m thread_func!
Good-bye!
pthread_join() returned 0 (Success)
pthread_kill() returned 3 (No such process)
My question: is it safe to use pthread_join() to check for terminated threads or I must always use pthread_kill()?

When a thread exits, the code for it stops running but its "corpse" is left lying around, for the return code to be collected by the parent.(1)
So, even though you think the thread has totally disappeared, that's not actually the case.
A call to pthread_join will examine said corpse for a return code so that the parent is notified as to how things turned out. After that's been collected, the thread can be truly laid to rest.(2)
That's why pthread_join() is returning a success code and pthread_kill is not - you're not allowed to kill a thread that's already dead, but you are allowed to join to one that's dead but still warm :-)
You may be better educated by trying the following code, which tries to join to the thread twice:
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
void * thread_func(void *arg) {
printf("Hello! I`m thread_func!\nGood-bye!\n");
return NULL;
}
int main(void) {
int res;
pthread_t thread_id;
printf("Hello from main()!\n");
pthread_create(&thread_id, NULL, thread_func, NULL);
printf("Waiting...\n");
sleep(3);
res = pthread_join(thread_id, NULL);
printf("pthread_join() returned %d (%s)\n", res, strerror(errno));
res = pthread_join(thread_id, NULL);
printf("pthread_join() returned %d (%s)\n", res, strerror(errno));
return 0;
}
On my system, I see:
Hello from main()!
Waiting...
Hello! I`m thread_func!
Good-bye!
pthread_join() returned 0 (No error)
pthread_join() returned 3 (No error)
In other words, though the thread is dead, the first pthread_join() works.
(1) You can pthread_detach a thread so that its resources are immediately released on termination, if you so wish. That would be along the lines of:
pthread_create(&thread_id, NULL, thread_func, NULL);
pthread_detach(thread_id);
but I'm pretty certain then a join will fail in that case even if the thread is still alive.
To see if a thread is still running regardless of whether it's detached or not, you can just use:
if (pthread_kill(thread_id, 0) != 0)
// Thread is gone.
(2) Apologies for the morbid tones of this answer, I'm feeling a little dark today :-)

pthread_join end the thread's uses of resources. It returns 0 when the thread has come to an end point and is ready to be cleaned up. Threads do not 'go away' all by themselves by default.
Returning zero means:
1. the thread got cleaned up
2. the thread WAS still there waiting
So no, do not use pthread_kill, you have a major assumption that is wrong: threads, unless set to be non-joinable do not exit and cleanup stack and memory resources when the thread returns. In other words, return NULL in you example di NOT terminate the thread. pthread_join did.
So, yes, use pthread_join to wait for a thread to complete.

Related

pthread_cond_signal does not wake up the thread that waits

I am writing a program which creates a thread that prints 10 numbers. When it prints 5 of them, it waits and it is notifying the main thread and then it continues for the next 5 numbers
This is test.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <pthread.h>
#include <unistd.h>
int rem = 10;
int count = 5;
pthread_mutex_t mtx;
pthread_cond_t cond1;
pthread_cond_t cond2;
void *f(void *arg)
{
int a;
srand(time(NULL));
while (rem > 0) {
a = rand() % 100;
printf("%d\n",a);
rem--;
count--;
if (count==0) {
printf("time to wake main thread up\n");
pthread_cond_signal(&cond1);
printf("second thread waits\n");
pthread_cond_wait(&cond2, &mtx);
printf("second thread woke up\n");
}
}
pthread_exit(NULL);
}
int main()
{
pthread_mutex_init(&mtx, 0);
pthread_cond_init(&cond1, 0);
pthread_cond_init(&cond2, 0);
pthread_t tids;
pthread_create(&tids, NULL, f, NULL);
while(1) {
if (count != 0) {
printf("main: waiting\n");
pthread_cond_wait(&cond1, &mtx);
printf("5 numbers are printed\n");
printf("main: waking up\n");
pthread_cond_signal(&cond2);
break;
}
pthread_cond_signal(&cond2);
if (rem == 0) break;
}
pthread_join(tids, NULL);
}
The output of the program is:
main: waiting
//5 random numbers
time to wake main thread up
second thread waits
5 numbers are printed
main: waking up
Since I do pthread_cond_signal(&cond2);I thought that the thread will wake up and prints the rest numbers but this is not the case. Any ideas why? Thanks in advance.
Summary
The issues have been summarized in comments, or at least most of them. So as to put an actual answer on record, however:
Pretty much nothing about the program's use of shared variables and synchronization objects is correct. It's behavior is undefined, and the specific manifestation observed is just one of the more likely in a universe of possible behaviors.
Accessing shared variables
If two different threads access (read or write) the same non-atomic object during their runs, and at least one of the accesses is a write, then all accesses must be properly protected by synchronization actions.
There is a variety of these, too large to cover comprehensively in a StackOverflow answer, but among the most common is to use a mutex to guard access. In this approach, a mutex is created in the program and designated for protecting access to one or more shared variables. Each thread that wants to access one of those variables locks the mutex before doing so. At some later point, the thread unlocks the mutex, lest other threads be permanently blocked from locking the mutex themselves.
Example:
pthread_mutex_t mutex; // must be initialized before use
int shared_variable;
// ...
void *thread_one_function(void *data) {
int rval;
// some work ...
rval = pthread_mutex_lock(&mutex);
// check for and handle lock failure ...
shared_variable++;
// ... maybe other work ...
rval = pthread_mutex_unlock(&mutex);
// check for and handle unlock failure ...
// more work ...
}
In your program, the rem and count variables are both shared between threads, and access to them needs to be synchronized. You already have a mutex, and using it to protect accesses to these variables looks like it would be appropriate.
Using condition variables
Condition variables have that name because they are designed to support a specific thread interaction pattern: that one thread wants to proceed past a certain point only if a certain condition, which depends on actions performed by other threads, is satisfied. Such requirements arise fairly frequently. It is possible to implement this via a busy loop, in which the thread repeatedly tests the condition (with proper synchronization) until it is true, but this is wasteful. Condition variables allow such a thread to instead suspend operation until a time when it makes sense to check the condition again.
The correct usage pattern for a condition variable should be viewed as a modification and specialization of the busy loop:
the thread locks a mutex guarding the data on which the condition is to be computed;
the thread tests the condition;
if the condition is satisfied then this procedure ends;
otherwise, the thread waits on a designated condition variable, specifying the (same) mutex;
when the thread resumes after its wait, it loops back to (2).
Example:
pthread_cond_t cv; // must be initialized before use
void *thread_two_function(void *data) {
int rval;
// some work ...
rval = pthread_mutex_lock(&mutex);
// check for and handle lock failure ...
while (shared_variable < 5) {
rval = pthread_cond_wait(&cv, &mutex);
// check for and handle wait failure ...
}
// ... maybe other work ...
rval = pthread_mutex_unlock(&mutex);
// check for and handle unlock failure ...
// more work ...
}
Note that
the procedure terminates (at (3)) with the thread still holding the mutex locked. The thread has an obligation to unlock it, but sometimes it will want to perform other work under protection of that mutex first.
the mutex is automatically released while the thread is waiting on the CV, and reacquired before the thread returns from the wait. This allows other threads the opportunity to access shared variables protected by the mutex.
it is required that the thread calling pthread_cond_wait() have the specified mutex locked. Otherwise, the call provokes undefined behavior.
this pattern relies on threads to signal or broadcast to the CV at appropriate times to notify any then-waiting other threads that they might want to re-evaluate the condition for which they are waiting. That is not modeled in the examples above.
multiple CVs can use the same mutex.
the same CV can be used in multiple places and with different associated conditions. It makes sense to do this when all the conditions involved are affected by the same or related actions by other threads.
condition variables do not store signals. Only threads that are already blocked waiting for the specified CV are affected by a pthread_cond_signal() or pthread_cond_broadcast() call.
Your program
Your program has multiple problems in this area, among them:
Both threads access shared variables rem and count without synchronization, and some of the accesses are writes. The behavior of the whole program is therefore undefined. Among the common manifestations would be that the threads do not observe each other's updates to those variables, though it's also possible that things seem to work as expected. Or anything else.
Both threads call pthread_cond_wait() without holding the mutex locked. The behavior of the whole program is therefore undefined. "Undefined" means "undefined", but it is plausible that the UB would manifest as, for example, one or both threads failing to return from their wait after the CV is signaled.
Neither thread employs the standard pattern for CV usage. There is no clear associated condition for either one, and the threads definitely don't test one. That leaves an implied condition of "this CV has been signaled", but that is unsafe because it cannot be tested before waiting. In particular, it leaves open this possible chain of events:
The main thread blocks waiting on cond1.
The second thread signals cond1.
The main thread runs all the way at least through signaling cond2 before the second thread proceeds to waiting on cond2.
Once (3) occurs, the program cannot avoid deadlock. The main thread breaks from the loop and tries to join the second thread, meanwhile the second thread reaches its pthread_cond_wait() call and blocks awaiting a signal that will never arrive.
That chain of events can happen even if the issues called out in the previous points is corrected, and it could manifest exactly the observable behavior you report.

Multithreading in Ubuntu 11.10 Vmware

I'm writing a multithread program in C++ using "pthread" libraries, but when I come to execute it on Ubuntu Virtual machine, my threads don't seem to run in parallel although I have a multicore processor (i7-2630QM)... the code is too long so I'm going to explain my problem with this simple code:
#include <iostream>
#include <pthread.h>
using namespace std;
void* myfunction(void* arg); //function that the thread will execute
int main()
{
pthread_t thread;
pthread_create(&thread, NULL, myfunction, NULL); //thread created
for (int i=0; i<10; i++) //show "1" 10 times
cout << "1";
pthread_join(thread, NULL); //wait for the thread to finish executing
return 0;
}
void* myfunction(void* arg)
{
for (int j=0; j<10; j++) //show "2" 10 times
cout << "2";
return NULL;
}
When I run this code on my host OS (Windows 7 with VC++2010) I get a result like 12212121211121212..., which is what a multithread app supposed to do, but when I run the same code on the guest OS (Ubuntu on Vmware with Code::Blocks) I always get 11111111112222222222 !!!
AFAIK the thread should run in parallel with the main() function, not sequentially. My VM's core number is set to 4, but it seems that the program use only one core, I don't know what is wrong? is it the code or ...? am I missing something here?
I appreciate any help, thanks in advance and please forgive my english :/
Use a semaphore (global or passed as a parameter to my function) to synchronise the threads properly. You're probably running into timing issues due to the different scheduling characteristics between the OSes (it's not really an issue per se). Have the first thread wait on the semaphore after the call to pthread_create, and have the new thread signal it as soon as myfunction is entered.
Also, your loops are quite short, make them take longer.
example here
Windows and Linux CRT/STDC++ libs have different synchronization behaviors. You can't learn much of anything about parallel execution with calls to cout. Write some actual parallel computation and measure elapsed time to tell what's going on.

Why the child thread block in my code?

I was trying to test the pthread by using mutex:
#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int global = 0;
void thread_set();
void thread_read();
int main(void){
pthread_t thread1, thread2;
int re_value1, re_value2;
int i;
for(i = 0; i < 5; i++){
re_value1 = pthread_create(&thread1,NULL, (void*)&thread_set,NULL);
re_value2 = pthread_create(&thread2,NULL,(void*)&thread_read,NULL);
}
pthread_join(thread1,NULL);
pthread_join(thread2,NULL);
/* sleep(2); */ // without it the 5 iteration couldn't finish
printf("exsiting\n");
exit(0);
}
void thread_set(){
pthread_mutex_lock(&mutex1);
printf("Setting data\t");
global = rand();
pthread_mutex_unlock(&mutex1);
}
void thread_read(){
int data;
pthread_mutex_lock(&mutex1);
data = global;
printf("the value is: %d\n",data);
pthread_mutex_unlock(&mutex1);
}
without the sleep(), the code won't finish the 5 iteration:
Setting data the value is: 1804289383
the value is: 1804289383
Setting data the value is: 846930886
exsiting
Setting data the value is: 1804289383
Setting data the value is: 846930886
the value is: 846930886
exsiting
It works only add the sleep() to the main thread, I think it should work without the sleep(), because the join() function wait for each child thread terminate
Any one can tell me why is it?
Your use of mutex objects looks fine, but this loop
for(i = 0; i < 5; i++) {
re_value1 = pthread_create(&thread1,NULL, (void*)&thread_set,NULL);
re_value2 = pthread_create(&thread2,NULL,(void*)&thread_read,NULL);
}
is asking for trouble as you are reusing the same thread instances thread1 and thread2 for each iteration of your loop. Internally this must be causing problems although I do not know exactly how it would manifest itself. You should really use a separate instance of thread object per thread you want to ensure reliable running. I do not know what would happen if you called pthread_create using an instance of a thread object that is already running but I suspect it is not a wise thing to do. I would suspect that at best it will block until the thread function has exited.
Also you are not cheking the return values from pthread_create() either which might be a good idea. In summary I would use a separate instance of thread objects, or add the pthread_join calls to the inside of your loop so that you are certain that the threads are finished running before the next call to pthread_create().
Finally the function signature for functions passed to pthread_create() are of the type
void* thread_function(void*);
and not
void thead_function()
like you have in your code.
You are creating 10 threads (5 iterations of two threads each), but only joining the last two you create (as mathematician1975 notes, you're re-using the thread handle variables, so the values from the last iteration are the only ones you have available to join on). Without the sleep(), it's quite possible that the scheduler has not started executing the first 8 threads before you hit exit(), which automatically terminates all threads, whether they have had a chance to run yet or not.

Odd behavior when creating and cancelling a thread in close succession

I'm using g++ version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) and libpthread v. 2-11-1. The following code simply creates a thread running Foo(), and immediately cancels it:
void* Foo(void*){
printf("Foo\n");
/* wait 1 second, e.g. using nanosleep() */
return NULL;
}
int main(){
pthread_t thread;
int res_create, res_cancel;
printf("creating thread\n);
res_create = pthread_create(&thread, NULL, &Foo, NULL);
res_cancel = pthread_cancel(thread);
printf("cancelled thread\n);
printf("create: %d, cancel: %d\n", res_create, res_cancel);
return 0;
}
The output I get is:
creating thread
Foo
Foo
cancelled thread
create: 0, cancel: 0
Why the second Foo output? Am I abusing the pthread API by calling pthread_cancel right after pthread_create? If so, how can I know when it's safe to touch the thread? If I so much as stick a printf() between the two, I don't have this problem.
I cannot reproduce this on a slightly newer Ubuntu. Sometimes I get one Foo and sometimes none. I had to fix a few things to get your code to compile (missing headers, missing call to some sleep function implied by a comment and string literals not closed), which indicate you did not paste the actual code which reproduced the problem.
If the problem is indeed real, it might indicate some thread cancellation problem in glibc's IO library. It looks a lot like two threads doing a flush(stdout) on the same buffer contents. Now that should never happen normally because the IO library is thread safe. But what if there is some cancellation scenario like: the thread has the mutex on stdout, and has just done a flush, but has not updated the buffer yet to clear the output. Then it is canceled before it can do that, and the main thread flushes the same data again.

Locking mutex in one thread and unlocking it in the other

Will this code be correct and portable?
void* aThread(void*)
{
while(conditionA)
{
pthread_mutex_lock(mutex1);
//do something
pthread_mutex_unlock(mutex2);
}
}
void* bThread(void*)
{
while(conditionB)
{
pthread_mutex_lock(mutex2);
//do something
pthread_mutex_unlock(mutex1);
}
}
In the actual application I have three threads - two for adding values to an array and one for reading them. And I need the third thread to display the contents of the array right after one of the other threads adds a new item.
It is not. If thread A gets to mutex_unlock(2) before thread B got to mutex_lock(2), you are facing undefined behavior. You must not unlock another thread's mutex either.
The pthread_mutex_lock Open Group Base Specification says so:
If the mutex type is PTHREAD_MUTEX_NORMAL [...] If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results.
As user562734's answer says, the answer is no - you cannot unlock a thread that was locked by another thread.
To achieve the reader/writer synchronisation you want, you should use condition variables - pthread_cond_wait(), pthread_cond_signal() and related functions.
Some implementations do allow locking and unlocking threads to
be different.
#include <iostream>
#include <thread>
#include <mutex>
using namespace std;
mutex m;
void f() {m.lock(); cout << "f: mutex is locked" << endl;}
void g() {m.unlock(); cout << "g: mutex is unlocked" << endl;}
main()
{
thread tf(f); /* start f */
tf.join(); /* Wait for f to terminate */
thread tg(g); /* start g */
tg.join(); /* Wait for g to terminate */
}
This program prints
f: mutex is locked
g: mutex is unlocked
System is Debian linux, gcc 4.9.1. -std=c++11.

Resources