First question: will ipcs -s display any information about pthread mutexes in use?
I ask in case pthread mutexes are implemented on top of the AIX semaphores or maybe vice versa.
Someone has spotted some semaphores hanging about (using ipcs) and suggested they may come from our library. However, we don't use the semget()/semop() family; we use pthread mutexes.
The mutexes are not process-shared, so I'm assuming they die along with the process?
They are separate, so ipcs will not show anything about pthread mutexes. Pthreads are implemented as a user-space library, not as a kernel subsystem the way SysV-style semaphores are. All pthread objects are local to the process. Pthreads themselves can map onto kernel threads, but the locking primitives are separate.
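To the last point: yes, a default (non-process-shared) mutex lives in ordinary process memory and is reclaimed when the process exits. For a pthread mutex to be visible across processes at all, you must place it in shared memory and set PTHREAD_PROCESS_SHARED explicitly. A minimal sketch, assuming Linux and an anonymous shared mapping inherited across fork():

    /* Sketch: a process-shared mutex must live in shared memory; a default
     * mutex is ordinary process memory and dies with the process. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stddef.h>
    #include <sys/mman.h>

    pthread_mutex_t *make_shared_mutex(void)
    {
        /* Anonymous shared mapping, visible to fork()ed children. */
        pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (m == MAP_FAILED)
            return NULL;

        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
        return m;
    }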
For debugging purposes I'd like to see what kind of mutex a certain pthread mutex is.
E.g. if it is PTHREAD_MUTEX_RECURSIVE or PTHREAD_MUTEX_ERRORCHECK etc.
You can obtain this information via pthread_mutexattr_gettype(), but that requires access to the pthread_mutexattr_t, and there is no pthread_mutex_get_attr (or even pthread_mutex_get_attr_np) function.
The solution does not need to be portable across operating systems (Linux, macOS), but it should preferably work with recent compilers (GCC 8 and later).
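Since portability across operating systems is not required, one route is to peek at glibc's internal representation: on Linux/glibc, pthread_mutex_t stores the type in the low bits of the internal __data.__kind field. A sketch assuming that layout -- it is undocumented and may change between glibc versions, so verify against your headers:

    /* Sketch relying on glibc internals: pthread_mutex_t is a union whose
     * __data.__kind field encodes the mutex type. Undocumented layout --
     * treat the field name and bit mask as assumptions to verify. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>

    static int mutex_kind(const pthread_mutex_t *m)
    {
        return m->__data.__kind & 3;   /* assumption: low two bits = type */
    }

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

        pthread_mutex_t m;
        pthread_mutex_init(&m, &attr);

        printf("kind=%d, PTHREAD_MUTEX_RECURSIVE=%d\n",
               mutex_kind(&m), PTHREAD_MUTEX_RECURSIVE);

        pthread_mutex_destroy(&m);
        pthread_mutexattr_destroy(&attr);
        return 0;
    }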
I have an MPI/Pthread program in which each MPI process runs on a separate computing node. Within each MPI process, a certain number of Pthreads (1-8) is launched. However, no matter how many Pthreads are launched within an MPI process, the overall performance is pretty much the same. I suspect all the Pthreads are running on the same CPU core. How can I assign threads to different CPU cores?
Each computing node has 8 cores (two quad-core Nehalem processors).
Open MPI 1.4
Linux x86_64
Questions like this are often dependent on the problem at hand. Most likely, you are running into a resource lock issue (where the threads are competing for a lock) -- this would look like only one core was doing any work, because only one thread can (effectively) do any work at any given time.
Setting CPU affinity for a particular thread is not a good solution. You should let the OS scheduler determine the optimal physical-core assignment for a given pthread.
Look at your code and try to figure out where you are locking where you shouldn't be, or if you've come up with a correct parallel solution to the problem at hand. You should also test a version of the program using only pthreads (not MPI) and see if scaling is achieved.
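That said, if you do want to experiment with explicit placement to rule the scheduler in or out, here is a Linux-specific sketch using the GNU extension pthread_setaffinity_np() -- a sketch under those assumptions, not a recommendation:

    /* Linux-specific sketch (GNU extension), only worth reaching for after
     * ruling out lock contention: pin the calling thread to a single core. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    static int pin_self_to_core(int core)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);               /* allow exactly one core */
        return pthread_setaffinity_np(pthread_self(), sizeof set, &set);
    }

    /* e.g., each worker calls pin_self_to_core(my_index % 8) on an 8-core
     * node; my_index is a hypothetical per-thread id you pass in yourself. */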
Is the Contiki scheduler preemptive? TinyOS is not; there's Nano-RK, though I'm not sure what state its development is in.
Contiki supports preemptive threads. Refer to:
https://github.com/contiki-os/contiki/wiki/Multithreading
Contiki supports preemptive multi-threading. In Contiki, multi-threading is implemented as a library on top of the event-driven kernel, which allows dynamic loading and replacement of individual services. The library can be linked with applications that require multi-threading.
The multi-threading library is divided into two parts: (i) a platform-independent part that interfaces to the event kernel, and (ii) a platform-specific part that implements stack switching and the preemption primitives.
Contiki uses protothreads to implement its so-called multi-threading. Protothreads are designed for severely memory-constrained devices because they are stack-less and lightweight. Their main features are: very small memory overhead (only two bytes per protothread), no extra stack per thread, and high portability (they are written entirely in C, with no architecture-specific assembly code).
Contiki does not allow interrupt handlers to post new events, and no process synchronization is provided. The interrupt handler (when required) and the reader functions must be synchronized to avoid race conditions. Please also have a look at the Ring Buffer library: https://github.com/contiki-os/contiki/wiki/Libraries
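To give a feel for the stack-less style described above, here is a minimal sketch using the standalone protothreads API (pt.h) that Contiki builds on; the events_pending counter is made up for the example:

    /* Minimal protothread sketch. Locals do not survive a blocking wait,
     * so state lives outside the thread body. */
    #include "pt.h"

    static int events_pending;          /* state kept outside the thread */
    static struct pt drain_pt;

    static PT_THREAD(drain(struct pt *pt))
    {
        PT_BEGIN(pt);
        for (;;) {
            /* Yields back to the caller until the condition holds. */
            PT_WAIT_UNTIL(pt, events_pending > 0);
            events_pending--;           /* "handle" one event */
        }
        PT_END(pt);
    }

    /* The main loop drives it: PT_INIT(&drain_pt); then repeatedly
     * call drain(&drain_pt); */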
It might be worth noting that the port for the most widely used sensor node, the TelosB, does not support preemption.
Can I create pthreads, and inside each pthread create an OpenCL environment and call the same kernel? What I am trying to do is launch OpenCL kernels in parallel on the same device. Is this possible?
Thanks for answering.
At first sight this seems unnecessary.
When you launch an OpenCL kernel with the clEnqueueNDRangeKernel() API call, you can launch as many work-items as you need, each running as its own lightweight thread on the same device. The OpenCL model is that one context/command queue can launch hundreds to thousands of lightweight kernel threads on a GPU.
Yes, as Tim pointed out: when OpenCL already supports so many threads/kernels, why would you want to wrap OpenCL in pthreads? Furthermore, threads on the GPU are very lightweight compared to pthreads. Pthreads are costly and involve a lot of context-switching overhead, which might actually bring down your performance significantly.
But launching many kernels on the same (in-order) command queue will execute the kernels sequentially. There should be a different command queue for each kernel. I believe a single context should not be a problem for launching the kernels in parallel...
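A sketch of that arrangement -- one context, one command queue per kernel -- assuming ctx, dev, and the kernels were created elsewhere; whether the kernels actually overlap still depends on the device and driver:

    /* Sketch: two queues in one context so two kernels can be enqueued
     * independently. Error handling omitted for brevity. */
    #include <CL/cl.h>
    #include <stddef.h>

    void launch_side_by_side(cl_context ctx, cl_device_id dev,
                             cl_kernel kernelA, cl_kernel kernelB, size_t gws)
    {
        cl_int err;
        /* A single in-order queue would serialize the two launches. */
        cl_command_queue q1 = clCreateCommandQueue(ctx, dev, 0, &err);
        cl_command_queue q2 = clCreateCommandQueue(ctx, dev, 0, &err);

        clEnqueueNDRangeKernel(q1, kernelA, 1, NULL, &gws, NULL, 0, NULL, NULL);
        clEnqueueNDRangeKernel(q2, kernelB, 1, NULL, &gws, NULL, 0, NULL, NULL);

        clFinish(q1);
        clFinish(q2);
        clReleaseCommandQueue(q1);
        clReleaseCommandQueue(q2);
    }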
Perhaps the question isn't that simple to answer... but what is your opinion? Should I use a non-blocking approach (libevent, for example) or Erlang lightweight processes to:
Achieve as many connections as possible with a given amount of RAM
Achieve as much throughput as possible with a given amount of CPU
The background is that I am planning to code a pub/sub server, and I cannot decide which approach I should use.
There is an article worth reading about building A Million-User Comet Application with Mochiweb. But I think stability, flexibility, and maintainability will be more important most of the time. Keeping this in mind, I would not consider anything other than Erlang, even if there were some better-performing solution.
Under the hood, the Erlang VM uses non-blocking IO. If your Erlang lightweight process blocks, the VM does not really do a kernel-level thread context switch. Most of the time, it will just wake up another lightweight process on the same OS thread (thus, it's not "blocking" in the strict sense of the word).
You can even start the VM with the +A argument and specify how many I/O event-loop threads you would like to allocate. (AFAIK, Node.js is still single-threaded, and if a callback function hangs, your VM is done for.)